Cloned DNA sequences related to the entire genomic RNA of human immunodeficiency virus II (HIV-2), polypeptides encoded by these DNA sequences and use of these DNA clones and polypeptides in diagnostic kits

ABSTRACT

A method for diagnosing an HIV-2 (LAV-II) infection and a kit containing reagents for the same are disclosed. These reagents include cDNA probes which are capable of hybridizing to at least a portion of the genome of HIV-2. In one embodiment, the DNA probes are capable of hybridizing to the entire genome of HIV-2. These reagents also include polypeptides encoded by some of these DNA sequences.

[0001] This application is a continuation-in-part of U.S. patent application Ser. No. ______ of Alizon et al. for “Cloned DNA Sequences Related to the Entire Genomic RNA of Human Immunodeficiency Virus II (HIV-2), Polypeptides Encoded by these DNA Sequences and Use of these DNA Clones and Polypeptides in Diagnostic Kits,” filed Jan. 16, 1987, which is a continuation-in-part of U.S. patent application Ser. No. 931,866 filed Nov. 21, 1986, which is a continuation-in-part application of U.S. patent application Ser. No. 916,080 of Montagnier et al. for “Cloned DNA Sequences Related to the Genomic RNA of the Human Immunodeficiency Virus II (HIV-2), Polypeptides Encoded by these DNA Sequences and Use of these DNA Clones and Polypeptides in Diagnostic Kits,” filed Oct. 6, 1986 and U.S. patent application Ser. No. 835,228 of Montagnier et al. for “New Retrovirus Capable of Causing AIDS, Antigens Obtained from this Retrovirus and Corresponding Antibodies and their Application for Diagnostic Purposes,” filed Mar. 3, 1986. The disclosures of each of these predecessor applications are expressly incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] The invention relates to cloned DNA sequences analogous to the genomic RNA of a virus known as Lymphadenopathy-Associated Virus II (“LAV-II”), a process for the preparation of these cloned DNA sequences, and their use as probes in diagnostic kits. In one embodiment, the invention relates to a cloned DNA sequence analogous to the entire genomic RNA of HIV-2 and its use as a probe. The invention also relates to polypeptides with amino acid sequences encoded by these cloned DNA sequences and the use of these polypeptides in diagnostic kits.

[0003] According to recently adopted nomenclature, as reported in Nature, May 1986, a substantially identical group of retroviruses which has been identified as one causative agent of AIDS is now referred to as Human Immunodeficiency Virus I (HIV-1). This previously described group of retroviruses includes Lymphadenopathy-Associated Virus I (LAV-I), Human T-cell Lymphotropic Virus-III (HTLV-III), and AIDS-Related Virus (ARV).

[0004] Lymphadenopathy-Associated Virus II has been described in U.S. application Ser. No. 835,228, which was filed Mar. 3, 1986, and is specifically incorporated herein by reference. Because LAV-II is a second, distinct causative agent of AIDS, LAV-II properly is classifiable as a Human Immunodeficiency Virus II (HIV-2). Therefore, “LAV-II” as used hereinafter describes a particular genus of HIV-2 isolates.

[0005] While HIV-2 is related to HIV-1 by its morphology, its tropism and its in vitro cytopathic effect on CD4 (T4) positive cell lines and lymphocytes, HIV-2 differs from previously described human retroviruses known to be responsible for AIDS. Moreover, the proteins of HIV-1 and 2 have different sizes and their serological cross-reactivity is restricted mostly to the major core protein, as the envelope glycoproteins of HIV-2 are not immune precipitated by HIV-1-positive sera except in some cases where very faint cross-reactivity can be detected. Since a significant proportion of the HIV infected patients lack antibodies to the major core protein of their infecting virus, it is important to include antigens to both HIV-1 and HIV-2 in an effective serum test for the diagnosis of the infection by these viruses.

[0006] HIV-2 was first discovered in the course of serological research on patients native to Guinea-Bissau who exhibited clinical and immunological symptoms of AIDS and from whom sero-negative or weakly sero-positive reactions to tests using an HIV-1 lysate were obtained. Further clinical studies on these patients isolated viruses which were subsequently named “LAV-II.”

[0007] One LAV-II isolate, subsequently referred to as LAV-II MIR, was deposited at the Collection Nationale des Cultures de Micro-Organismes (CNCM) at the Institut Pasteur in Paris, France on Dec. 19, 1985 under Accession No. I-502 and has also been deposited at the British ECA CC under No. 87.001.001 on Jan. 9, 1987. A second LAV-II isolate was deposited at CNCM on Feb. 21, 1986 under Accession No. I-532 and has also been deposited at the British ECA CC under No. 87.001.002 on Jan. 9, 1987. This second isolate has been subsequently referred to as LAV-II ROD. Other isolates deposited at the CNCM on Dec. 19, 1986 are HIV-2 IRMO (No. I-642) and HIV-2 EHO (No. I-643). Several additional isolates have been obtained from West African patients, some of whom have AIDS, others with AIDS-related conditions and others with no AIDS symptoms. All of these viruses have been isolated on normal human lymphocyte cultures and some of them were thereafter propagated on lymphoid tumor cell lines such as CEM and MOLT.

[0008] Due to the sero-negative or weak sero-positive results obtained when using kits designed to identify HIV-1 infections in the diagnosis of these new patients with HIV-2 disease, it has been necessary to devise a new diagnostic kit capable of detecting HIV-2 infection, either by itself or in combination with an HIV-1 infection. The present inventors have, through the development of cloned DNA sequences analogous to at least a portion of the genomic RNA of LAV-II ROD viruses, created the materials necessary for the development of such kits.

SUMMARY OF THE INVENTION

[0009] As noted previously, the present invention relates to the cloned nucleotide sequences homologous or identical to at least a portion of the genomic RNA of HIV-2 viruses and to polypeptides encoded by the same. The present invention also relates to kits capable of diagnosing an HIV-2 infection.

[0010] Thus, a main object of the present invention is to provide a kit capable of diagnosing an infection caused by the HIV-2 virus. This kit may operate by detecting at least a portion of the RNA genome of the HIV-2 virus or the provirus present in the infected cells through hybridization with a DNA probe or it may operate through the immunodiagnostic detection of polypeptides unique to the HIV-2 virus.

[0011] Additional objects and advantages of the present invention will be set forth in part in the description which follows, or may be learned from practice of the invention. The objects and advantages may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.

[0012] To achieve these objects and in accordance with the purposes of the present invention, cloned DNA sequences related to the entire genomic RNA of the LAV-II virus are set forth. These sequences are analogous specifically to the entire genome of the LAV-II ROD strain.

[0013] To further achieve the objects and in accordance with the purposes of the present invention, a kit capable of diagnosing an HIV-2 infection is described. This kit, in one embodiment, contains the cloned DNA sequences of this invention which are capable of hybridizing to viral RNA or analogous DNA sequences to indicate the presence of an HIV-2 infection. Different diagnostic techniques can be used which include, but are not limited to: (1) Southern blot procedures to identify viral DNA which may or may not be digested with restriction enzymes; (2) Northern blot techniques to identify viral RNA extracted from cells; and (3) dot blot techniques, i.e., direct filtration of the sample through an ad hoc membrane such as nitrocellulose or nylon without previous separation on agarose gel. Suitable material for the dot blot technique could be obtained from body fluids including, but not limited to, serum and plasma, supernatants from culture cells, or cytoplasmic extracts obtained after cell lysis and removal of membranes and nuclei of the cells by ultra-centrifugation as accomplished in the “CYTODOT” procedure as described in a booklet published by Schleicher and Schuell.

[0014] In an alternate embodiment, the kit contains the polypeptides created using these cloned DNA sequences. These polypeptides are capable of reacting with antibodies to the HIV-2 virus present in sera of infected individuals, thus yielding an immunodiagnostic complex.

[0015] To further achieve the objects of the invention, a vaccinating agent is provided which comprises at least one peptide selected from the polypeptide expression products of the viral DNA in admixture with suitable carriers, adjuvants, or stabilizers.

[0016] It is understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate one embodiment of the invention and, together with the description, serve to explain the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 generally depicts the nucleotide sequence of a cloned complementary DNA (cDNA) to the genomic RNA of HIV-2.

[0018] FIG. 1A depicts the genetic organization of HIV-1, the position of the HIV-1 HindIII fragment used as a probe to screen the cDNA library, and the restriction map of the HIV-2 cDNA clone, E2.

[0019] FIG. 1B depicts the nucleotide sequence of the 3′ end of HIV-2. The corresponding region of the HIV-1 LTR was aligned using the Wilbur and Lipman algorithm (window: 10; K-tuple: 7; gap penalty: 3) as described by Wilbur and Lipman in Proc. Natl. Acad. Sci. USA 80: 726-730 (1983), specifically incorporated herein by reference. The U3-R junction in HIV-1 is indicated and the poly A addition signal and potential TATA promoter regions are boxed. In FIG. 1B, the symbols B, H, Ps and Pv refer to the restriction sites BamHI, HindIII, PstI and PvuII, respectively.

[0020] FIG. 2 generally depicts the HIV-2 specificity of the E2 clone.

[0021] FIGS. 2A and B specifically depict a Southern blot of DNA extracted from CEM cells infected with the following isolates: HIV-2_(ROD) (a, c), HIV-2_(DUL) (b, d), and HIV-1_(BRU) (e, f). DNA in lanes a, b, f was Pst I digested; in c, d, e DNA was undigested.

[0022] FIGS. 2C and D specifically depict dot blot hybridization of pelleted virions from CEM cells infected by HIV-1_(BRU) (1), Simian Immunodeficiency Virus (SIV) isolate Mm 142-83 (3), HIV-2_(DUL) (4), HIV-2_(ROD) (5), and HIV-1_(ELI) (6). Dot 2 is a pellet from an equivalent volume of supernatant from uninfected CEM. Thus, FIGS. 2A and C depict hybridization with the HIV-2 cDNA (E2) and FIGS. 2B and D depict hybridization to an HIV-1 probe consisting of a 9 Kb SacI insert from HIV-1 BRU (clone lambda J 19).

[0023] FIG. 3 generally depicts a restriction map of the HIV-2 ROD genome and its homology to HIV-1.

[0024] FIG. 3A specifically depicts the organization of three recombinant phage lambda clones, ROD 4, ROD 27, and ROD 35. In FIG. 3A, the open boxes represent viral sequences, the LTR are filled, and the dotted boxes represent cellular flanking sequences (not mapped). Only some characteristic restriction enzyme sites are indicated. λROD 27 and λROD 35 are derived from integrated proviruses while λROD 4 is derived from a circular viral DNA. The portion of the lambda clones that hybridizes to the cDNA E2 is indicated below the maps. A restriction map of the λROD isolate was reconstructed from these three lambda clones. In this map, the restriction sites are identified as follows: B: BamHI; E: EcoRI; H: HindIII; K: KpnI; Ps: PstI; Pv: PvuII; S: SacI; X: XbaI. R and L are the right and left BamHI arms of the lambda L47.1 vector.

[0025] FIG. 3B specifically depicts dots 1-11 which correspond to the single-stranded DNA form of M13 subclones from the HIV-1_(BRU) cloned genome (λJ19). Their size and position on the HIV-1 genome, determined by sequencing, are shown below the figure. Dot 12 is a control containing lambda phage DNA. The dot-blot was hybridized in low stringency conditions as described in Example 1 with the complete λROD 4 clone as a probe, and successively washed in 2× SSC, 0.1% SDS at 25° C. (Tm −42° C.), 1× SSC, 0.1% SDS at 60° C. (Tm −20° C.), and 0.1× SSC, 0.1% SDS at 60° C. (Tm −3° C.) and exposed overnight. A duplicate dot blot was hybridized and washed in stringent conditions (as described in Example 2) with the labelled lambda J19 clone carrying the complete HIV-1_(BRU) genome. HIV-1 and HIV-2 probes were labelled to the same specific activity (10⁸ cpm/g.).

[0026] FIG. 4 generally depicts the restriction map polymorphism in different HIV-2 isolates and shows a comparison of HIV-2 to SIV.

[0027] FIG. 4A specifically depicts DNA (20 ug. per lane) from CEM cells infected by the isolate HIV-2_(DUL) (panel 1) or peripheral blood lymphocytes (PBL) infected by the isolates HIV-2_(GOM) (panel 2) and HIV-2_(MIR) (panel 3) digested with: EcoRI (a), PstI (b), and HindIII (c). Much less viral DNA was obtained with HIV-2 isolates propagated on PBL. Hybridization and washing were in stringent conditions, as described in Example 2, with 10⁶ cpm/ml. of each of the E2 insert (cDNA) and the 5 kb. HindIII fragment of λROD 4, labelled to 10⁹ cpm/ug.

[0028] FIG. 4B specifically depicts DNA from HUT 78 (a human T lymphoid cell line) cells infected with STLV3 MAC isolate Mm 142-83. The same amounts of DNA and enzymes were used as indicated in panel A. Hybridization was performed with the same probe as in A, but in non-stringent conditions. As described in Example 1 washing was for one hour in 2× SSC, 0.1% SDS at 40° C. (panel 1) and after exposure, the same filter was re-washed in 0.1× SSC, 0.1% SDS at 60° C. (panel 2). The autoradiographs were obtained after overnight exposure with intensifying screens.

[0029] FIG. 5 depicts the positions of the plasmids derived from λROD 27, λROD 35 and λROD 4.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0030] Reference will now be made in detail to the presently preferred embodiments of the invention, which, together with the following examples, serve to explain the principles of the invention.

[0031] The genetic structure of the HIV-2 virus has been analyzed by molecular cloning according to the method set forth herein and in the Examples. A restriction map of the genome of this virus is included in FIG. 4. In addition, the partial sequence of a cDNA complementary to the genomic RNA of the virus has been determined. This cDNA sequence information is included in FIG. 1.

[0032] Also contained herein are data describing the molecular cloning of the complete 9.5 kb genome of HIV-2, data describing the observation of restriction map polymorphism between different isolates, and an analysis of the relationship between HIV-2 and other human and simian retroviruses. From the totality of these data, diagnostic probes can be discerned and prepared.

[0033] Generally, to practice one embodiment of the present invention, a series of filter hybridizations of the HIV-2 RNA genome with probes derived from the complete cloned HIV-1 genome and from the gag and pol genes were conducted. These hybridizations yielded only extremely weak signals even in conditions of very low stringency of hybridization and washing. Thus, it was found to be difficult to assess the amount of HIV-2 viral and proviral DNA in infected cells by Southern blot techniques.

[0034] Therefore, a complementary DNA (cDNA) to the HIV-2 genomic RNA initially was cloned in order to provide a specific hybridization probe. To construct this cDNA, an oligo (dT) primed cDNA first-strand was made in a detergent-activated endogenous reaction using HIV-2 reverse transcriptase with virions purified from supernatants of infected CEM cells. The CEM cell line is a lymphoblastoid CD4+ cell line described by G. E. Foley et al. in Cancer 18: 522-529 (1965), specifically incorporated herein by reference. The CEM cells used were infected with the isolate ROD and were continuously producing high amounts of HIV-2.

[0035] After second-strand synthesis, the cDNAs were inserted into the M13 tg 130 bacteriophage vector. A collection of 10⁴ M13 recombinant phages was obtained and screened in situ with an HIV-1 probe spanning 1.5 kb. of the 3′ end of the LAV_(BRU) isolate (depicted in FIG. 1A). Some 50 positive plaques were detected, purified, and characterized by end sequencing and cross-hybridizing the inserts. This procedure is described in more detail in Example 1 and in FIG. 1.

[0036] The different clones were found to be complementary to the 3′ end of a polyadenylated RNA having the AATAAA signal about 20 nucleotides upstream of the poly A tail, as found in the long terminal repeat (LTR) of HIV-1. The LTR region of HIV-1 has been described by S. Wain-Hobson et al. in Cell 40: 9-17 (1985), specifically incorporated herein by reference. The portion of the HIV-2 LTR that was sequenced was related only distantly to the homologous domain in HIV-1 as demonstrated in FIG. 1B. Indeed, only about 50% of the nucleotides could be aligned and about a hundred insertions/deletions needed to be introduced. In comparison, the homology of the corresponding domains in HIV-1 isolates from USA and Africa is greater than 95% and no insertions or deletions are seen.
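The kind of comparison summarized above (fraction of aligned nucleotides and number of insertions/deletions) can be illustrated with a minimal sketch. It does not implement the Wilbur and Lipman K-tuple method cited for FIG. 1B; it substitutes a plain Needleman-Wunsch global alignment, and the scoring values, function names and toy sequences are assumptions chosen only for illustration, not the actual LTR sequences.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment (illustrative scoring values).

    Returns the two input sequences with '-' inserted at gap positions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def summarize(aligned_a, aligned_b):
    """Return (identical columns, gapped columns, total columns)."""
    identical = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    gapped = sum(x == "-" or y == "-" for x, y in zip(aligned_a, aligned_b))
    return identical, gapped, len(aligned_a)


# Toy sequences (hypothetical, for illustration only).
top, bottom = global_align("ACGT", "ACT")
identical, gapped, columns = summarize(top, bottom)
print(top, bottom, f"{100.0 * identical / columns:.0f}% identical, {gapped} gap column(s)")
```

Percent identity here is computed over all alignment columns, gaps included; other conventions (for example, identity over ungapped columns only) would give different figures for the same alignment.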

[0037] The largest insert of this group of M13 clones was a 2 kb. clone designated E2. Clone E2 was used as a probe to demonstrate its HIV-2 specificity in a series of filter hybridization experiments. Firstly, this probe could detect the genomic RNA of HIV-2 but not HIV-1 in stringent conditions as shown in FIG. 2, C and D. Secondly, positive signals were detected in Southern blots of DNA from cells infected with the ROD isolate as well as other isolates of HIV-2 as shown in FIG. 2, A and FIG. 4, A. No signal was detected with DNA from uninfected cells or HIV-1 infected cells, confirming the exogenous nature of HIV-2. In undigested DNA from HIV-2 infected cells, an approximately 10 kb. species, probably corresponding to linear unintegrated viral DNA, was principally detected along with a species with an apparent size of 6 kb., likely to be the circular form of the viral DNA. Conversely, rehybridization of the same filter with an HIV-1 probe under stringent conditions showed hybridization to HIV-1 infected cells only as depicted in FIG. 2, B.

[0038] To isolate the remainder of the genome of HIV-2, a genomic library in lambda phage L47.1 was constructed. Lambda phage L47.1 has been described by W. A. M. Loenen et al. in Gene 10: 249-259 (1980), specifically incorporated herein by reference. The genomic library was constructed with a partial Sau3AI restriction digest of the DNA from the CEM cell line infected with HIV-2_(ROD).

[0039] About 2×10⁶ recombinant plaques were screened in situ with labelled insert from the E2 cDNA clone. Ten recombinant phages were detected and plaque purified. Of these phages, three were characterized by restriction mapping and Southern blot hybridization with the E2 insert and probes from its 3′ end (LTR) or 5′ end (envelope), as well as with HIV-1 subgenomic probes. In this instance, HIV-1 probes were used under non-stringent conditions.

[0040] A clone carrying a 9.5 kb. insert and derived from a circular viral DNA was identified as containing the complete genome and designated λROD 4. Two other clones, λROD 27 and λROD 35, were derived from integrated proviruses and found to carry an LTR and cellular flanking sequences and a portion of the viral coding sequences as shown in FIG. 3, A.

[0041] Fragments of the lambda clones were subcloned into the plasmid vector pUC 18.

[0042] Plasmid pROD 27-5′ is derived from λROD 27 and contains the 5′ 2 Kb of the HIV-2 genome and cellular flanking sequences (5′ LTR and 5′ viral coding sequences to the EcoRI site).

[0043] Plasmid pROD 4-8 is derived from λROD 4 and contains the approximately 5 Kb HindIII fragment that is the central part of the HIV-2 genome.

[0044] The inserts of plasmids pROD 27-5′ and pROD 4.8 overlap.

[0045] Plasmid pROD 4.7 contains a HindIII 1.8 Kb fragment from λROD 4. This fragment is located 3′ to the fragment subcloned into pROD 4.8 and contains about 0.8 Kb of viral coding sequences and the part of the lambda phage (λL47.1) left arm located between the BamHI and HindIII cloning sites.

[0046] Plasmid pROD 35 contains all the HIV-2 coding sequences 3′ to the EcoRI site, the 3′ LTR and about 4 Kb of cellular flanking sequences.

[0047] Plasmids pROD 27-5′ and pROD 35 in E. coli strain HB 101 are deposited respectively under No. I-626 and I-633 at the CNCM, and have also been deposited at the NCIB (British Collection). These plasmids are depicted in FIG. 5. Plasmids pROD 4-7 and pROD 4-8 in E. coli strain TG1 are deposited respectively under No. I-627 and I-628 at the CNCM.

[0048] To reconstitute the complete HIV-2 ROD genome, pROD 35 is linearized with EcoRI and the EcoRI insert of pROD 27-5′ is ligated in the correct orientation into this site.

[0049] The relationship of HIV-2 to other human and simian retroviruses was surmised from hybridization experiments. The relative homology of the different regions of the HIV-1 and 2 genomes was determined by hybridization of fragments of the cloned HIV-1 genome with the labelled λROD 4 expected to contain the complete HIV-2 genome (FIG. 3, B). Even in very low stringency conditions (Tm −42° C.), the hybridization of HIV-1 and 2 was restricted to a fraction of their genomes, principally the gag gene (dots 1 and 2), the reverse transcriptase domain in pol (dot 3), the end of pol and the Q (or sor) genes (dot 5) and the F gene (or 3′ orf) and 3′ LTR (dot 11). The HIV-1 fragment used to detect the HIV-2 cDNA clones contained the dot 11 subclone, which hybridized well to HIV-2 under non-stringent conditions. Only the signal from dot 5 persisted after stringent washing. The envelope gene, the region of the tat gene and a part of pol thus seemed very divergent. These data, along with the LTR sequence obtained (FIG. 1, B), indicated that HIV-2 is not an envelope variant of HIV-1, as are African isolates from Zaire described by Alizon et al., Cell 46:63-74 (1986).

[0050] It was observed that HIV-2 is related more closely to the Simian Immunodeficiency Virus (SIV) than it is to HIV-1. This correlation has been described by F. Clavel et al. in C.R. Acad. Sci. (Paris) 302: 485-488 (1986) and F. Clavel et al. in Science 233: 343-346 (1986), both of which are specifically incorporated herein by reference. Simian Immunodeficiency Virus (also designated Simian T-cell Lymphotropic Virus Type 3, STLV-3) is a retrovirus first isolated from captive macaques with an AIDS-like disease in the USA. This simian virus has been described by M. D. Daniel et al. in Science 228: 1201-1204 (1985), specifically incorporated herein by reference.

[0051] All the SIV proteins, including the envelope, are immune precipitated by sera from HIV-2 infected patients, whereas the serological cross-reactivity of HIV-1 to 2 is restricted to the core proteins. However, SIV and HIV-2 can be distinguished by slight differences in the apparent molecular weight of their proteins.

[0052] In terms of nucleotide sequence, it also appears that HIV-2 is closely related to SIV. The genomic RNA of SIV can be detected in stringent conditions as shown in FIG. 2, C by HIV-2 probes corresponding to the LTR and 3′ end of the genome (E2) or to the gag or pol genes. Under the same conditions, HIV-1 derived probes do not detect the SIV genome as shown in FIG. 2, D.

[0053] In Southern blots of DNA from SIV-infected cells, a restriction pattern clearly different from HIV-2_(ROD) and other isolates is seen. All the bands persist after a stringent washing, even though the signal is considerably weakened, indicating a sequence homology throughout the genomes of HIV-2 and SIV. It has recently been shown that baboons and macaques could be infected experimentally by HIV-2, thereby providing an interesting animal model for the study of the HIV infection and its preventive therapy. Indeed, attempts to infect non-human primates with HIV-1 have been successful only in chimpanzees, which are not a convenient model.

[0054] From an initial survey of the restriction maps for certain of the HIV-2 isolates obtained according to the methods described herein, it is already apparent that HIV-2, like HIV-1, undergoes restriction site polymorphism. FIG. 4A depicts examples of such differences for three isolates, all different one from another and from the cloned HIV-2_(ROD). It is very likely that these differences at the nucleotide level are accompanied by variations in the amino-acid sequence of the viral proteins, as evidenced in the case of HIV-1 and described by M. Alizon et al. in Cell 46: 63-74 (1986), specifically incorporated herein by reference. It is also to be expected that the various isolates of HIV-2 will exhibit amino acid heterogeneities. See, for example, Clavel et al., Nature 324 (18):691-695 (1986), specifically incorporated herein by reference.
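Restriction site polymorphism of this kind can be illustrated with a short sketch that locates recognition sites in a linear sequence and lists the resulting fragment lengths. The recognition sequences for EcoRI (GAATTC), PstI (CTGCAG) and HindIII (AAGCTT) are the standard ones; the two "isolate" sequences are hypothetical toy strings, not HIV-2 sequence, and for simplicity the cut is placed at the start of each site rather than at the enzyme's true cleavage offset.

```python
# Standard recognition sequences for the three enzymes used in FIG. 4A.
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG", "HindIII": "AAGCTT"}


def site_positions(seq, site):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, site):
    """Fragment lengths after cutting a linear molecule at each site.

    Simplification: the cut is placed at the first base of the site.
    """
    bounds = [0] + site_positions(seq, site) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b - a > 0]


# Hypothetical toy "isolates": the second differs by one base (GAATTC -> GAGTTC),
# which abolishes one EcoRI site and merges two fragments into one.
isolate_a = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
isolate_b = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 5

for name, seq in (("isolate_a", isolate_a), ("isolate_b", isolate_b)):
    print(name, fragment_lengths(seq, SITES["EcoRI"]))
```

A single nucleotide change is thus enough to alter the band pattern seen on a Southern blot, which is why differing patterns between isolates are read as evidence of sequence variation.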

[0055] Further, the characterization of HIV-2 will also delineate the domain of the envelope glycoprotein that is responsible for the binding to the surface of the target cells and the subsequent internalization of the virus. This interaction was shown to be mediated by the CD4 molecule itself in the case of HIV-1 and similar studies tend to indicate that HIV-2 uses the same receptor. Thus, although there is wide divergence between the env genes of HIV-1 and 2, small homologous domains of the envelopes of the two HIVs could represent a candidate receptor binding site. This site could be used to raise a protective immune response against this group of retroviruses.

[0056] From the data discussed herein, certain nucleotide sequences have been identified which are capable of being used as probes in diagnostic methods and to obtain the immunological reagents necessary to diagnose an HIV-2 infection. In particular, these sequences may be used as probes in hybridization reactions with the genetic material of infected patients to indicate whether the RNA of the HIV-2 virus is present in these patients' lymphocytes or whether an analogous DNA is present. In this embodiment, the test methods which may be utilized include Northern blots, Southern blots and dot blots. One particular nucleotide sequence which may be useful as a probe is the combination of the 5 kb. HindIII fragment of ROD 4 and the E2 cDNA used in FIG. 4.

[0057] In addition, the genetic sequences of the HIV-2 virus may be used to create the polypeptides encoded by these sequences. Specifically, these polypeptides may be created by expression of the cDNA obtained according to the teachings herein in hosts such as bacteria, yeast or animal cells. These polypeptides may be used in diagnostic tests such as immunofluorescence assays (IFA), radioimmunoassays (RIA) and Western Blot tests.

[0058] Moreover, it is also contemplated that additional diagnostic tests, including additional immunodiagnostic tests, may be developed in which the DNA probes or the polypeptides of this invention may serve as one of the diagnostic reagents. The invention described herein includes these additional test methods.

[0059] In addition, monoclonal antibodies to these polypeptides or fragments thereof may be created. The monoclonal antibodies may be used in immunodiagnostic tests in a manner analogous to the polypeptides described above.

[0060] The polypeptides of the present invention may also be used as immunogenic reagents to induce protection against infection by HIV-2 viruses. In this embodiment, the polypeptides produced by recombinant-DNA techniques would function as vaccine agents.

[0061] Also, the polypeptides of this invention may be used in competitive assays to test the ability of various antiviral agents to prevent the virus from fixing on its target.

[0062] Thus, it is to be understood that application of the teachings of the present invention to a specific problem or environment will be within the capabilities of one having ordinary skill in the art in light of the teachings contained herein. Examples of the products of the present invention and representative processes for their isolation and manufacture appear above and in the following examples.

EXAMPLES

Example 1 Cloning of a cDNA Complementary to Genomic RNA From HIV-2 Virions

[0063] HIV-2 virions were purified from 5 liters of supernatant from a culture of the CEM cell line infected with the ROD isolate and a cDNA first strand using oligo (dT) primer was synthesized in a detergent-activated endogenous reaction on pelleted virus, as described by M. Alizon et al. in Nature, 312: 757-760 (1984), specifically incorporated herein by reference. RNA-cDNA hybrids were purified by phenol-chloroform extraction and ethanol precipitation. The second-strand cDNA was created by the DNA polymerase I/RNAase H method of Gubler and Hoffman in Gene, 25: 263-269 (1983), specifically incorporated herein by reference, using a commercial cDNA synthesis kit obtained from Amersham. After attachment of EcoRI linkers (obtained from Pharmacia), EcoRI digestion, and ligation into EcoRI-digested dephosphorylated M13 tg 130 vector (obtained from Amersham), a cDNA library was obtained by transformation of the E. coli TG1 strain. Recombinant plaques (10⁴) were screened in situ on replica filters with the 1.5 kb. HindIII fragment from clone J19, corresponding to the 3′ part of the genome of the LAV_(BRU) isolate of HIV-1, ³²P labelled to a specific activity of 10⁹ cpm/ug. The filters were prehybridized in 5× SSC, 5× Denhardt solution, 25% formamide, and denatured salmon sperm DNA (100 ug/ml.) at 37° C. for 4 hours and hybridized for 16 hours in the same buffer (Tm −42° C.) plus 4×10⁷ cpm of the labelled probe (10⁶ cpm/ml. of hybridization buffer). The washing was done in 5× SSC, 0.1% SDS at 25° C. for 2 hours. 20× SSC is 3M NaCl, 0.3M Na citrate. Positive plaques were purified and single-stranded M13 DNA prepared and end-sequenced according to the method described in Proc. Nat'l. Acad. Sci. USA, 74: 5463-5467 (1977) of Sanger et al.
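The stringency notation used above (incubation temperature expressed relative to the melting temperature, e.g. Tm −42° C.) can be illustrated with a rough calculation. The sketch below uses a commonly quoted empirical approximation for long DNA-DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) − 0.61 (% formamide) − 500/L; the coefficients vary between sources, and this is not stated to be the formula used by the inventors. The 42% G+C content, the assumption that the citrate is trisodium citrate, and all function names are assumptions made for illustration, so the offsets computed need not reproduce those quoted in the text.

```python
import math


def sodium_molar(ssc_fold):
    """Na+ molarity of n-fold SSC, taking 20x SSC as 3 M NaCl + 0.3 M Na citrate.

    Assumption: trisodium citrate, i.e. three Na+ per citrate.
    """
    nacl = 3.0 * ssc_fold / 20.0
    citrate = 0.3 * ssc_fold / 20.0
    return nacl + 3.0 * citrate


def duplex_tm(na_molar, gc_percent, formamide_percent=0.0, length=1000):
    """Approximate Tm (deg C) of a long DNA-DNA hybrid (empirical formula)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def criterion(temp_c, ssc_fold, gc_percent, formamide_percent=0.0, length=1000):
    """Incubation temperature relative to Tm; negative means below Tm."""
    tm = duplex_tm(sodium_molar(ssc_fold), gc_percent, formamide_percent, length)
    return temp_c - tm


GC_ASSUMED = 42.0  # hypothetical G+C content, not a value given in the text

# Low-stringency hybridization: 5x SSC, 25% formamide, 37 deg C, 1.5 kb probe.
low = criterion(37.0, 5.0, GC_ASSUMED, formamide_percent=25.0, length=1500)
# Stringent wash: 0.1x SSC at 60 deg C.
high = criterion(60.0, 0.1, GC_ASSUMED, length=1500)
print(f"low stringency: Tm {low:+.0f} C; stringent wash: Tm {high:+.0f} C")
```

The further below Tm the incubation temperature lies, the more mismatched duplexes remain stable, which is why the low-stringency conditions allow a divergent HIV-1 probe to detect HIV-2 clones.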

Example 2 Hybridization of DNA from HIV-1 and HIV-2 Infected Cells and RNA from HIV-1 and 2 and SIV Virions with a Probe Derived From an HIV-2 Cloned cDNA

[0064] DNA was extracted from infected CEM cells continuously producing HIV-1 or 2. The DNA (20 ug), digested with PstI or undigested, was electrophoresed on a 0.8% agarose gel, and Southern-transferred to nylon membrane. Virion dot-blots were prepared in duplicate, as described by F. Clavel et al. in Science 233: 343-346 (1986), specifically incorporated herein by reference, by pelleting volumes of supernatant corresponding to the same amount of reverse transcriptase activity. Prehybridization was done in 50% formamide, 5× SSC, 5× Denhardt solution, and 100 mg./ml. denatured salmon sperm DNA for 4 hours at 42° C. Hybridization was performed in the same buffer plus 10% Dextran sulphate, and 10⁶ cpm/ml. of the labelled E2 insert (specific activity 10⁹ cpm/ug.) for 16 hours at 42° C. Washing was in 0.1× SSC, 0.1% SDS for 2×30 min. After exposure for 16 hours with intensifying screens, the Southern blot was dehybridized in 0.4 N NaOH, neutralized, and rehybridized in the same conditions to the HIV-1 probe labelled to 10⁹ cpm/ug.

Example 3 Cloning in Lambda Phage of the Complete Provirus DNA of HIV-2

[0065] DNA from the HIV-2_(ROD) infected CEM (FIG. 2, lanes a and c) was partially digested with Sau3AI. The 9-15 kb. fraction was selected on a 5-40% sucrose gradient and ligated to BamHI arms of the lambda L47.1 vector. Plaques (2×10⁶) obtained after in vitro packaging and plating on E. coli LA 101 strain were screened in situ with the insert from the E2 cDNA clone. Approximately 10 positive clones were plaque purified and propagated on E. coli C600 recBC. The ROD 4, 27, and 35 clones were amplified and their DNA characterized by restriction mapping and Southern blotting with the HIV-2 cDNA clone under stringent conditions, and gag-pol probes from HIV-1 used under non-stringent conditions.

Example 4 Complete Genomic Sequence of the ROD HIV-2 Isolate

[0066] Experimental analysis of the HIV-2 ROD isolate yielded the following sequence, which represents the complete genome of this HIV-2 isolate. Genes and major expression products identified within the following sequence are indicated by nucleotides numbered below:

[0067] 1) GAG gene (546-2111) expresses a protein product having a molecular weight of around 55 Kd, which is cleaved into the following proteins:

[0068] a) p 16 (546-950)

[0069] b) p 26 (951-1640)

[0070] c) p 12 (1701-2111)

[0071] 2) polymerase (1829-4936)

[0072] 3) Q protein (4869-5513)

[0073] 4) R protein (5682-5996)

[0074] 5) X protein (5344-5679)

[0075] 6) Y protein (5682-5996)

[0076] 7) Env protein (6147-8720)

[0077] 8) F protein (8557-9324)

[0078] 9) TAT gene (5845-6140 and 8307-8400) is expressed by two exons separated by an intron.

[0079] 10) ART protein (6071-6140 and 8307-8536) is similarly the expression product of two exons.

[0080] 11) LTR:R (1-173 and 9498-9671)

[0081] 12) U5 (174-299)

[0082] 13) U3 (8942-9497)
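As a simple consistency check on the coordinates listed above, the following sketch computes the length of each coding feature and the number of whole codons it spans. The coordinates are copied from the list, treated as 1-based and inclusive; the dictionary layout and function names are illustrative assumptions. Whether each range includes a terminating codon is not stated in the text, so only nucleotide arithmetic is performed.

```python
# Coding features copied from the list above (1-based, inclusive coordinates).
# Two-exon products (TAT, ART) are given as two segments.
FEATURES = {
    "GAG": [(546, 2111)],
    "p16": [(546, 950)],
    "p26": [(951, 1640)],
    "p12": [(1701, 2111)],
    "polymerase": [(1829, 4936)],
    "Q": [(4869, 5513)],
    "R": [(5682, 5996)],
    "X": [(5344, 5679)],
    "Y": [(5682, 5996)],
    "Env": [(6147, 8720)],
    "F": [(8557, 9324)],
    "TAT": [(5845, 6140), (8307, 8400)],
    "ART": [(6071, 6140), (8307, 8536)],
}


def span_length(segments):
    """Total number of nucleotides covered by inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def codon_report(features):
    """Map each feature to (nucleotides, whole codons, leftover nucleotides)."""
    report = {}
    for name, segments in features.items():
        n = span_length(segments)
        report[name] = (n, n // 3, n % 3)
    return report


if __name__ == "__main__":
    for name, (n, codons, rest) in codon_report(FEATURES).items():
        print(f"{name}: {n} nt, {codons} codons, remainder {rest}")
```

With the coordinates as listed, every feature length is an exact multiple of three; for example, the GAG range covers 1566 nucleotides, or 522 codons.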

[0083] It will be known to one of skill in the art that the absolutenumbering which has been adopted is not essential. For example, thenucleotide within the LTR which is designated as “1” is a somewhatarbitrary choice. What is important is the sequence informationprovided. GGTCGCTCTGCGGAGAGGCTGGCAGATTGAGCCCTGGGAGGTTCTCTCCAGCACTAGCAG         .         .         .         .         .         .GTAGAGCCTGGGTGTTCCCTGCTAGACTCTCACCAGCACTTGGCCGGTGCTGGGCAGACG         .         .         .       100         .         .GCCCCACGCTTGCTTGCTTAAAAACCTCTTAATAAAGCTGCCAGTTAGAAGCAAGTTAAG         .         .         .         .         .         .TGTGTGCTCCCATCTCTCCTAGTCGCCGCCTGGTCATTCGGTGTTCACCTGAGTAACAAG         .       200         .         .         .         .ACCCTGGTCTGTTAGGACCCTTCTTGCTTTGGGAAACCGAGGCAGGAAAATCCCTAGCAG         .         .         .         .         .      300GTTGGCGCCTGAACAGGGACTTGAAGAAGACTGAGAAGTCTTGGAACACGGCTGAGTGAA         .         .         .         .         .         .GGCAGTAAGGGCGGCAGGAACAAACCACGACGGAGTGCTCCTAGAAAGGCGCGGGCCGAG         .         .         .       400         .         .GTACCAAAGGCAGCGTGTGGAGCGGGAGGAGAAGAGGCCTCCGGGTGAAGGTAAGTACCT         .         .         .         .         .         .ACACCAAAAACTGTAGCCGAAAGGGCTTGCTATCCTACCTTTAGACAGGTAGAAGATTGT         .       500         .         .         .         .     MetGlyAlaArgAsnSerValLeuArgGlyLysLysAlaAspGluLeuGluArgIleGGGAGATGGGCGCGAGAAACTCCGTCTTGAGAGGGAAAAAAGCAGATGAATTAGAAAGAA         .         .         .         .         .       600 ArgLeuArgProGlyGlyLysLysLysTyrArgLeuLysHisIleValTrpAlaAlaAsnTCAGGTTACGGCCCGGCGGAAAGAAAAAGTACAGGCTAAAACATATTGTGTGGGCAGCGA         .         .         .         .         .         .  LysLeuAspArgPheGlyLeuAlaGluSerLeuLeuGluSerLysGluGlyCysGlnLysATAAATTGGACAGATTCGGATTAGCAGAGAGCCTGTTGGAGTCAAAAGAGGGTTGTCAAA         .         .         .       700         .         .  
IleLeuThrValLeuAspProMetValProThrGlySerGluAsnLeuLysSerLeuPheAAATTCTTACAGTTTTAGATCCAATGGTACCGACAGGTTCAGAAAATTTAAAAAGTCTTT         .         .         .         .         .         .  AsnThrValCysValIleTrpCysIleHisAlaGluGluLysValLysAspThrGluGlyTTAATACTGTCTGCGTCATTTGGTGCATACACGCAGAAGAGAAAGTGAAAGATACTGAAG         .       800         .         .         .         .  AlaLysGlnIleValArgArgHisLeuValAlaGluThrGlyThrAlaGluLysMetProGAGCAAAACAAATAGTGCGGAGACATCTAGTGCCAGAAACAGGAACTGCAGAGAAAATGG         .         .         .         .         .       900  SerThrSerArgProThrAlaProSerSerGluLysGlyGlyAsnTyrProValGlnHisCAAGCACAAGTAGACCAACAGCACCATCTAGCGAGAAGGGAGGAAATTACCCAGTGCAAC         .         .         .         .         .         .  ValGlyGlyAsnTyrThrHisIleProLeuSerProArgThrLeuAsnAlaTrpValLysATGTAGGCGGCAACTACACCCATATACCGCTGAGTCCCCGAACCCTAAATGCCTGGGTAA         .         .         .       1000         .         .  LeuValGluGluLysLysPheGlyAlaGluValValProGlyPheGlnAlaLeuSerGluAATTAGTAGAGGAAAAAAAGTTCGGGGCAGAAGTAGTGCCAGGATTTCAGGCACTCTCAG         .         .         .         .         .         .GlyCysThrProTyrAspIleAsnGlnMetLeuAsnCysValGlyAspHisGlnAlnAlaAAGGCTGCACGCCCTATGATATCAACCAAATGCTTAATTGTGTGGGCGACCATCAAGCAG         .       1100         .         .         .         .  MetGlnIleIleArgGluIleIleAsnGluGluAlaAlaGluTrpAspValGlnLisProCCATGCAGATAATCAGGGAGATTATCAATGAGGAAGCAGCAGAATGGGATGTCCAACATC         .         .         .         .         .       1200  IleProGlyProLeuProAlaGlyGlnLeuArgGluProArgGlySerAspIleAlaGlyCAATACCAGGCCCCTTACCAGCGGGGCAGCTTAGAGAGCCAAGGGGATCTGACATAGCAG         .         .         .         .         .         .  ThrThrSerThrValGluGluGlnIleGlnTrpMetPheArgProGlnAsnProValProGGACAACAAGCACAGTAGAAGAACAGATCCAGTGGATGTTTAGGCCACAAAATCCTGTAC         .         .         .       1300         .         .ValGlyAsnIleTyrArgArgTrpIleGlnIleGlyLeuGlnLysCysValArgMetTyrCAGTAGGAAACATCTATAGAAGATGGATCCAGATAGGATTGCAGAAGTGTGTCAGGATGT         .         .         .         .         .         
.  AsnProThrAsnIleLeuAspIleLysGlnGlyProLysGluProPheGlnSerTyrValACAACCCGACCAACATCCTAGACATAAAACAGGGACCAAAGGAGCCGTTCCAAAGCTATG         .       1400         .         .         .         .AspArgPheTyrLysSerLeuArgAlaGluGlnThrAspProAlaValLysAsnTrpMetTAGATAGATTCTACAAAAGCTTGAGGGCAGAACAAACAGATCCAGCAGTGAAGAATTGGA         .         .         .         .         .       1500  ThrGlnThrLeuLeuValGlnAsnAlaAsnProAspCysLysLeuValLeuLysGlyLeuTGACCCAAACACTGCTAGTACAAAATGCCAACCCAGACTGTAAATTAGTGCTAAAAGGAC         .         .         .         .         .         .  GlyMetAsnProThrLeuGluGluMetLeuThrAlaCysGlnGlyValGlyGlyProGlyTAGGGATGAACCCTACCTTAGAAGACATGCTGACCGCCTGTCAGGGGGTAGGTGGGCCAG         .         .         .       1600         .         .  GlnLysAlaArgLeuMetALaGluAlaLeuLysGluValIleGlyProAlaProIleProGCCAGAAAGCTAGATTAATGGCAGAGGCCCTGAAAGAGGTCATAGGACCTGCCCCTATCC         .         .         .         .         .         .  PheAlaAlaAlaGlnGlnArgLysAlaPheLysCysTrpAsnCysGlyLysGluGlyHisCATTCGCAGCAGCCCAGCAGAGAAAGGCATTTAAATGCTGGAACTGTGGAAAGGAAGGGC         .       1700         .         .         .         .  SerAlaArgGlnCysArgAlaProArgArgGlnGlyCysTrpLysCysGlyLysProGlyACTCGGCAAGACAATGCCGAGCACCTAGAAGGCAGGGCTGCTGGAAGTGTGGTAAGCCAG         .         .         .         .         .       1800                            ThrGlyArgPhePheArgThrGlyProLeuGly  HisIleMetThrAsnCysProAspArgGlnAlaGlyPheLeuGlyLeuGlyProTrpGlyGACACATCATGACAAACTGCCCCAGATAGACAGGCAGGTTTTTTAGGACTGGGCCCTTGGG         .         .         .         .         .         . LysGluAlaProGlnLeuProArgGlyProSerSerAlaGlyAlaAspThrAsnSerThr  LysLysProArgAsnPheProValAlaGlnValProGlnGlyLeuThrProThrAlaProGAAAGAAGCCCCGCAACTTCCCCGTGGCCCAAGTTCCGCAGGCGCTGACACCAACAGCAC         .         .         .       1900         .         . ProSerGlySerSerSerGlySerThrGlyGluIleTyrAlaAlaArgGluLysThrGlu  ProValAspProAlaValAspLeuLeuGluLysTyrMetGlnGlnGlyLysArgGlnArgCCCCAGTGGATCCAGCAGTGGATCTACTGGAGAAATATATCCAGCAAGGGAAAAGACAGA         .         .         .         .         . 
        . ArgAlaGluArgGluThrIleGlnGlySerAspArgGlyLeuThrAlaProArgAlaGly  GluGlnArgGluArgProTyrLysGluValThrGluAspLeuLeuHisLeuGluGlnGlyGAGAGCAGAGAGAGAGACCATACAAGGAAGTGACAGAGGACTTACTGCACCTCGAGCAGG         .       2000         .         .         .         . GlyAspThrIleGlnGlyAlaThrAsnArgGlyLeuAlaAlaProGlnPheSerLeuTrp  GluThrProTyrArgGluProProThrGluAspLeuLeuHisLeuAsnSerLeuPheGlyGGGAGACACCATACAGGGAGCCACCAACAGAGGACTTGCTGCACCTCAATTCTCTCTTTG         .         .         .         .         .       2100 LysArgProValValThrAlaTyrIleGluGlyGlnProValGluValLeuLeuAspThr  LysAspGln GAAAAGACCAGTAGTCACAGCATACATTGAGGGTCAGCCAGTAGAAGTCTTGTTAGACAC         .         .         .         .         .         . GlyAlaAspAspSerIleValAlaGlyIleGluLeuGlyAsnAsnTyrSerProLysIleAGGGGCTGACGACTCAATAGTAGCAGGAATAGAGTTAGGGAACAATTATAGCCCAAAAAT         .         .         .       2200         .         . ValGlyGlyIleGlyGlyPheIleAsnThrLysGluTyrLysAsnValGluIleGluValAGTAGGGGGAATAGGGGGATTCATAAATACCAAGGAATATAAAAATGTAGAAATAGAAGT         .         .         .         .         .         . LeuAsnLysLysValArgAlaThrIleMetThrGlyAspThrProIleAsnIlePheGlyTCTAAATAAAAAGGTACGGGCCACCATAATGACAGGCGACACCCCAATCAACATTTTTGG         .       2300         .         .         .         . ArgAsnIleLeuThrAlaLeuGlyMetSerLeuAsnLeuProValAlaLysValGluProCAGAAATATTCTGACAGCCTTAGGCATGTCATTAAATCTACCAGTCGCCAAAGTAGAGCC         .         .         .         .         .       2400 IleLysIleMetLeuLysProGlyLysAspGlyProLysLeuArgGlnTrpProLeuThrAATAAAAATAATGCTAAAGCCAGGGAAAGATGGACCAAAACTGAGACAATGGCCCTTAAC         .         .         .         .         .         . LysGluLysIleGluAlaLeuLysGluIleCysGluLysMetGluLysGluGlyGlnLeuAATAAAAATAATGCTAAAGCCAGGGAAAGATGGACCAAAACTGAGACAATGGCCCTTAAC         .         .         .         .         .         . LysGluLysIleGluAlaLeuLysGluIleCysGluLysMetGluLysGluGlyGlnLeuAAAAGAAAAAATAGAAGCACTAAAAGAAATCTGTGAAAAAATGGAAAAAGAAGGCCAGCT         .         .         .       2500         .         . 
GluGluAlaProProThrAsnProTyrAsnThrProThrPheAlaIleLysLysLysAspAGAGGAAGCACCTCCAACTAATCCTTATAATACCCCCAGATTTGCAATCAAGAAAAAGGA         .         .         .         .         .         . LysAsnLysTrpArgMetLeuIleAspPheArgGluLeuAsnLysValThrGlnAspPheCAAAAACAAATGGAGGATGCTAATAGATTTCAGAGAACTAAACAAGGTAACTCAAGATTT         .       2600         .         .         .         . ThrTluIleGlnLeuGlyIleProHisProAlaGlyLeuAlaLysLysArgArgIleThrCACAGAAATTCAGTTAGGAATTCCACACCCAGCAGGGTTGGCCAAGAAGAGAAGAATTAC         .         .         .         .         .       2700 ValLeuAspValGlyAspAlaTyrPheSerIleProLeuHisGluAspPheArgProTyrTGTACTAGATGTAGGGGATGCTTACTTTTCCATACCACTACATGAGGACTTTAGACCATA         .         .         .         .         .         . ThrAlaPheThrLeuProSerValAsnAsnAlaGluProGlyLysArgTyrIleTyrLysTACTGCATTTACTCTACCATCAGTGAACAATGCAGAACCAGGAAAAAGATACATATATAA         .         .         .       2800         .         . ValLeuProGlnGlyTrpLysGlySerProAlaIlePheGlnHisThrMetArgGlnValAGTCTTGCCACAGGGATGGAAGGGATCACCAGCAATTTTTCAACACACAATGAGACAGGT         .         .         .         .         .         . LeuGluProPheArgLysAlaAsnLysAspValIleIleIleGlnTyrMetAspAspIleATTAGAACCATTCAGAAAAGCAAACAAGGATGTCATTATCATTCAGTACATGGATGATAT         .       2900         .         .         .         . LeuIleAlaSerAspArgThrAspLeuGluHisAspArgValValLeuGlnLeuLysGluCTTAATAGCTAGTGACAGGACAGATTTAGAACATGATAGGGTAGTCCTGCAGCTCAAGGA         .         .         .         .         .       3000 LeuLeuAsnGlyLeuGlyPheSerThrProAspGluLysPheGlnLysAspProProTyrACTTCTAAATGGCCTAGGATTTTCTACCCCAGATGAGAAGTTCCAAAAAGACCCTCCATA         .         .         .         .         .         . HisTrpMetGlyTyrGluLeuTrpProThrLysTrpLysLeuGlnLysIleGlnLeuProCCACTGGATGGGCTATGAACTATGGCCAACTAAATGGAAGTTGCAGAAAATACAGTTGCC         .         .       3100         .         .         . GlnLysGluIleTrpThrValAsnAspIleGlnLysLeuValGlyValLeuAsnTrpAlaCCAAAAAGAAATATGGACAGTCAATGACATCCAGAAGCTAGTGGGTGTCCTAAATTGGGC         .         .         .         .         .         . 
AlaGlnLeuTyrProGlyIleLysThrLysHisLeuCysArgLeuIleArgGlyLysMetAGCACAACTCTACCCAGGGATAAAGACCAAACACTTATGTAGGTTAATCAGAGGAAAAAT         .       3200         .         .         .         . ThrLeuThrGluGluValGlnTrpThrGluLeuAlaGluAlaGluLeuGluGluAsnArgGACACTCACAGAAGAAGTACAGTGGACAGAATTACCAGAAGCAGAGCTAGAAGAAAACAG         .         .         .         .         .       3300 IleIleLeuSerGlnGluGlnGluGlyHisTyrTyrGlnGluGluLysGluLeuGluAlaAATTATCCTAAGCCAGGAACAAGAGGGACACTATTACCAAGAAGAAAAAGAGCTAGAAGC         .         .         .         .         .         . ThrValGlnLysAspGlnGluAsnGlnTrpThrTyrLysIleHisGlnGluGluLysIleAACAGTCCAAAAGGATCAAGAGAATCAGTGGACATATAAAATACACCAGGAAGAAAAAAT         .         .         .       3400         .         . LeuLysValGlyLysTyrAlaLysValLysAsnThrHisThrAsnGlyIleArgLeuLeuTCTAAAAGTAGGAAAATATGCAAAGGTGAAAAACACCCATACCAATGGAATCAGATTGTT         .         .         .         .         .         . AlaGlnValValGlnLysIleGlyLysGluAlaLeuValIleTrpGlyArgIleProLysAGCACAGGTAGTTCAGAAAATAGGAAAAGAAGCACTAGTCATTTGGGGACGAATACCAAA         .       3500         .         .         .         . PheHisLeuProValGluArgGluIleTrpGluGlnTrpTrpAspAsnTyrTrpGlnValATTTCACCTACCAGTAGAGAGAGAAATCTGGGAGCAGTGGTGGGATAACTACTGGCAAGT         .         .         .         .         .       3600 ThrTrpIleProAspTrpAspPheValSerThrProProLeuValArgLeuAlaPheAsnGACATGGATCCCAGACTGGGACTTCGTGTCTACCCCACCACTGGTCAGGTTAGCGTTTAA         .         .         .         .         .         . LeuGluGlnThrThrAsnGlnGlnAlaGluLeuGluAlaPheAlaMetAlaLeuThrAspACTAGAGCAAACTACCAATCAGCAAGCAGAACTAGAAGCCTTTGCGATGGCACTAACAGA         .       3800.         .         .         .         . SerGlyProLysValAsnIleIleValAspSerGlnTyrValMetGlyIleSerAlaSerCTCGGGTCCAAAAGTTAATATTATAGTAGACTCACAGTATGTAATGGGGATCAGTCCAAG         .         .         .         .         .       3900 GlnProThrGluSerGluSerLysIleValAsnGlnIleIleGluGluMetIleLysLysCCAACCAACAGAGTCAGAAAGTAAAATAGTGAACCAGATCATAGAAGAAATGATAAAAAA         .         .         .         .         .         . 
GluAlaIleTyrValAlaTrpValProAlaHisLysGlyIleGlyGlyAsnGlnGluValGGAAGCAATCTATGTTGCATGGGTCCCAGCCCACAAAGGCATAGGGGGAAACCAGGAAGT         .         .         .       4000         .         . AspHisLeuValSerGlnGlyIleArgGlnValLeuPheLeuGluLysIleGluProAlaAGATCATTTAGTGAGTCAGGGTATCAGACAAGTGTTGTTCCTGGAAAAAATAGAGCCCGC         .         .         .         .         .         . GlnGluGluHisGluLysTyrHisSerAsnValLysGluLeuSerHisLysPheGlyIleTCAGGAAGAACATGAAAAATATCATAGCAATGTAAAAGAACTGTCTCATAAATTTGGAAT         .       4100         .         .         .         . ProAsnLeuValAlaArgGlnIleValAsnSerCysAlaGlnCysGlnGlnLysGlyGluACCCAATTTAGTGGCAAGGCAAATAGTAAACTCATGTGCCCAATGTCAACAGAAAGGGGA         .         .         .         .         .       4200 AlaIleHisGlyGlnValAsnAlaGluLeuGlyThrTrpGlnMetAspCysThrHisLeuAGCTATACATGGGCAAGTAAATGCAGAACTAGGCACTTGGCAAATGGACTGCACACATTT         .         .         .         .         .         . GluGlyLysIleIleIleValAlaValHisValAlaSerGlyPheIleGluAlaGluValAGAAGGAAAGATCATTATAGTAGCAGTACATGTTGCAAGTGGATTTATAGAAGCAGAAGT         .         .         .       4300         .         . IleProGluGluSerGlyArgGlnThrAlaLeuPheLeuLeuLysLeuAlaSerArgTrpCATCCCACAGGAATCAGGAAGACAAACAGCACTCTTCCTATTGAAACTGGCAAGTAGGTG         .         .         .         .         .         . ProIleThrHisLeuHisThrAspAsnGlyAlaAsnPheThrSerGlnGluValLysMetGCCAATAACACACTTGCATACAGATAATGGTGCCAACTTCACTTCACAGGAGGTGAAGAT         .       4400         .         .         .         . ValAlaTrpTrpIleGlyIleGluGlnSerPheGlyValProTyrAsnProGlnSerGlnGGTAGCATGGTGGATAGGTATAGAACAATCCTTTGGAGTACGTTACAATCCACAGAGCCA         .         .         .         .         .       4500 GlyValValGluAlaMetAsnHisHisLeuLysAsnGlnIleSerArgIleArgGluGlnAGGAGTAGTAGAAGCAATGAATCACCATCTAAAAAACCAAATAAGTAGAATCAGAGAACA         .         .         .         .         .         . AlaAsnThrIleGluThrIleValLeuMetAlaIleHisCysMetAsnPheLysArgArgGGCAAATACAATAGAAACAATAGTACTAATGGCAATTCATTGCATGAATTTTAAAAGAAG         .         .         .       4600         .         . 
GlyGlyIleGlyAspMetThrProSerGluArgLeuIleAsnMetIleThrThrGluGlnGGGGGGAATAGGGGATATGACTCCATCAGAAAGATTAATCAATATGATCACCACAGAACA         .         .         .         .         .         . GluIleGlnPheLeuGlnAlaLysAsnSerLysLeuLysAspPheArgValTyrPheArgAGAAGGCAGAGATCAGTTGTGGAAAGGACCTGGGGAACTACTGTGGAAAGGAGAAGGAGC         .       4700         .         .         .         . GluGlyArgAspGlnLeuTrpLysGlyProGlyGluLeuLeuTrpLysGlyGluGlyAlaAGAAGGCAGAGATCAGTTGTGGAAAGGACCTGGGGAACTACTGTGGAAAGGAGAAGGAGC         .         .         .         .         .       4800 ValLeuValLysValGlyThrAspIleLysIleIleProArgArgLysAlaLysIleIleAGTCCTAGTCAAGGTAGGAACAGACATAAAAATAATACCAAGAAGGAAAGCCAAGATCAT         .         .         .         .         .         . ArgAspTyrGlyGlyArgGlnGluMetAspSerGlySerHisLeuGluGlyAlaArgGlu        MetGluGluAspLysArgTrpIleValValProThrTrpArgValProGlyArgCAGAGACTATGGAGGAAGACAAGAGATGGATAGTGGTTCCCACCTGGAGGGTGCCAGGGA         .         .         .       4900         .         . AspGlyGluMetAla  MetGluLysTrpHisSerLeuValLysTryLeuLysTyrLysThrLysAspLeuGluLysGGATGGAGAAATGGCATAGCCTTGTCAAGTATCTAAAATACAAAACAAAGGATCTAGAAA         .         .         .         .         .         .  ValCysTyrValProHisHisLysValGlyTrpAlaTrpTrpThrCysSerArgValIleAGGTGTGCTATGTTCCCCACCATAAGGTGGGATGGGCATGCTGGACTTGCAGCAGGGTAA           .       5000         .         .         .         .    PheProLeuLysGlyAsnSerHisLeuGluIleGlnAlaTyrTrpAsnLeuThrProGlu  TATTCCCATTAAAAGGAAACAGTCATCTAGAGATACAGGCATATTGGAACTTAACACCAG           .         .         .         .         .       5100    LysGlyTrpLeuSerSerTyrSerValArgIleThrTrpTyrThrGluLysPheTrpThr  AAAAAGGATGGCTCTCCTCTTATTCAGTAAGAATAACTTGGTACACAGAAAAGTTCTGGA           .         .         .         .         .         .    AspValThrProAspCysAlaAspValLeuIleHisSerThrTyrPheProCysPheThr  CAGATGTTACCCCAGACTGTGCAGATGTCCTAATACATAGCACTTATTTCCCTTGCTTTA           .         .         .       5200         .         .    
AlaGlyGluValArgArgAlaIleArgGlyGluLysLeuLeuSerCysCysAsnTyrPro  CAGCAGGTGAAGTAAGAAGAGCCATCAGAGGGGAAAAGTTATTGTCCTGCTGCAATTATC           .         .         .         .         .         .    ArgAlaHisArgAlaGlnValProSerLeuGlnPheLeuAlaLeuValValValGlnGlnCCCGAGCTCATAGAGCCCAGGTACCGTCACTTCAATTTCTGGCCTTAGTGGTAGTGCAAC           .       5300          .         .         .         .   MetThrAspProArgGluThrValProProGlyAsnSerGlyGluGluThrIleGly  AsnAspArgProGluArgAspSerThrThrArgLysGlnArgArgArgAspTyrArgArgAAAATGACAGACCCCAGAGAGACAGTACCACCAGGAAACAGCCGCGAAGAGACTATCGGA         .         .         .         .         .       5400GluAlaPheAlaTrpLeuAsnArgThrValGluAlaIleAsnArgGluAlaValAsnHis  GlyLeuArgLeuAlaLysGlnAspSerArgSerHisLysGlnArgSerSerGluSerProGAGGCCTTCGCCTGGCTAAACAGGACAGTAGAAGCCATAAACAGAGAAGCAGTGAATCAC         .         .         .         .         .         .LeuProArgGluLeuIlePheGluValTrpGlnArgSerTrpArgTyrTrpHisAspGlu  ThrProArgThrTyrPheProGlyValAlaGluValLeuGluIleLeuAlaCTACCCCGAGAACTTATTTTCCAGGTGTGGCAGACGTCCTGGAGATACTGGCATGATGAA         .         .         .       5500         .         .GlnGlyMetSerGluSerTyrThrLysTyrArgTyrLeuCysIleIleGlnLysAlaValCAAGGGATGTCAGAAAGTTACACAAAGTATAGATATTTGTGCATAATACAGAAAGCACTG         .         .         .         .         .         .TyrMetHisValArgLysGlyCysThrCysLeuGlyArgGlyHisGlyProGlyGlyTrpTACATGCATGTTAGGAAAGGGTGTACTTGCCTGGGGACGGGACATGGGCCAGGAGGGTGG         .       5600         .         .         .         .ArgProGlyProProProProProProProGlyLeuVal                                         MetAlaGluAlaProThrGluAGACCAGGGCCTCCTCCTCCTCCCCCTCCAGGTCTGGTGTAATGGCTGAAGCACCAACAG         .         .         .         .         .       5700  LeuProProValAspGlyThrProLeuArgGluProGlyAspGluTrpIleIleGluIleAGCTCCCCCCGGTGGATGGGACCCCACTGAGGGAGCCAGGGGATGAGTGGATAATAGAAA         .         .         .         .         .         .  LeuArgGluIleLysGluGluAlaLeuLysHisPheAspProArgLeuLeuIleAlaLeuTCTTGAGAGAAATAAAAGAAGAAGCTTTAAAGCATTTTGACCCTCGCTTGCTAATTGCTC         .         .  
       .       5800         .         .                        MetGluThrPreLeuLysAlaProGluSerSerLeu  GlyLysTyrIleTyrThrArgHisGlyAspThrLeuGluGlyAlaArgGluLeuIleLysTTGGCAAATATATCTATACTAGACATGGAGACACCCTTGAAGGCGCCAGAGAGCTCATTA         .         .         .         .         .         .LysSerCysAsnGluProPheSerArgThrSerGluGlnAspValAlaThrGlnGluLeu  ValLeuGlnArgAlaLeuPheThrHisPheArgAlaGlyCysGlyHisSerArgIleGlyAAGTCCTGCAACGAGCCCTTTTCACGCACTTCAGAGCACTTCACAGCAGGATGTGGCCACTCAAGAATTG         .       5900         .         .         .         .AlaArgGlnGlyGluGluIleLeuSerGlnLeuTyrArgProLeuGluThrCysAsnAsn  GlnThrArgGlyGlyAsnProLeuSerAlaIleProThrProArgAsnMetGlnGCCAGACAAGGGGAGGAAATCCTCTCTCAGCTATACCGACCCCTAGAAACATGCAATAAC         .         .         .         .         .       6000SerCysTyrCysLysArgCysCysTyrHisCysGlnMetCysPheLeuAsnLysGlyLeuTCATGCTATTGTAAGCCATGCTGCTACCATTGTCAGATGTGTTTTCTAAACAAGGGGCTC         .         .         .         .         .         .GlyIleCysTyrGluArgLysGlyArgArgArgArgThrProLysLysThrLysThrHis          MetAsnGluArgAlaAspGluGluGlyLeuGlnArgLysLeuArgLeuIleGGGATATGTTATGAACGAAAGGGCAGACGAAGAAGCACTCCAAAGAAAACTAAGACTCAT         .         .         .       6100         .         .ProSerProThrProAspLys  ArgLeuLeuHisGlnThr                           MetMetAsnGlnLeuLeuIleAlaIleLeuLeuAlaCCGTCTCCTACACCAGACAAGTGAGTATGATGAATCAGCTGCTTATTGCCATTTTATTAG         .         .         .         .         .         .  SerAlacysLeuValTyrCysThrGlnTyrValThrValPheTyrGlyValProThrTrpCTAGTGCTTGCTTAGTATATTGCACCCAATATGTAACTGTTTTCTATGGCGTACCCACGT         .       6200         .         .         .         .  LysAsnAlaThrIleProLeuPheCysAlaThrArgAsnArgAspThrTrpGlyThrIleGGAAAAATGCAACCATTCCCCTCTTTTGTGCAACCAGAAATAGGGATACTTGGGGAACCA         .         .         .         .         .       6300  GlnCysLeuProAspAsnAspAspTyrGlnGluIleThrLeuAsnValThrGluAlaPheTACAGTGCTTGCCTGACAATGATGATTATCAGGAAATAACTTTGAATGTAACAGAGGCTT         .         .         .         .         .         
.AspAlaTrpAsnAsnThrValThrGluGlnAlaIleGluAspValTrpHisLeuPheGluTTGATGCATGGAATAATACAGTAACAGAACAAGCAATAGAAGATGTCTGGCATCTATTCG         .         .         .       6400         .         .  ThrSerIleLysProCysValLysLeuThrProLeuCysValAlaMetLysCysSerSerAGACATCAATAAAACCATGTGTCAAACTAACACCTTTATGTGTAGCAATGAAATGCAGCA         .         .         .         .         .         .  ThrGluSerSerThrGlyAsnAsnThrThrSerLysSerThrSerThrThrThrThrThrGCACAGAGAGCAGCACAGGGAACAACACAACCTCAAAGAGCACAAGCACAACCACAACCA         .       6500         .         .         .         .  ProThrAspGlnGluGlnGluIleSerGluAspThrProcysAlaArgAlaAspAsnCysCACCCACAGACCAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGACAACT         .         .         .         .         .       6600  SerGlyLeuGlyGluGluGluThrIleAsncysGlnPheAsnMetThrGlyLeuGluArgGCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTCAATATGACAGGATTAGAAA         .         .         .         .         .         .  AspLysLysGlnTyrAsnGluThrTrpTyrSerLysAspValValCysGluThrAsnGAGATAAGAAAAAACAGTATAATGAAACATGGTACTCAAAAGATGTGGTTTGTGAGACAA         .         .         .       6700         .         .  AsnSerThrAsnGlnThrGlnCysTyrMetAsnHisCysAsnThrSerValIleThrGluATAATAGCACAAATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATCACAG         .         .         .         .         .         .  SerCysAspLysHisTyrTrpAspAlaIleArgPheArgTyrCysAlaProProGlyTyrAATCATGTGACAAGCACTATTGGGATGCTATAAGGTTTAGATACTGTGCACCACCGGGTT         .       6800         .         .         .         .  AlaLeuLeuArgCysAsnAspThrAsnTyrSerGlyPheAlaProAsnCysSerLysValATGCCCTATTAAGATGTAATGATACCAATTATTCAGGCTTTGCACCCAACTGTTCTAAAG         .         .         .         .         .       6900  ValAlaSerThrCysThrArgMetMetGluThrGlnThrSerThrTrpPheGlyPheAsnTAGTAGCTTCTACATGCACCAGGATGATGGAAACGCAAACTTCCACATGGTTTGGCTTTA         .         .         .         .         .         .  GlyThrArgAlaGluAsnArgThrTyrIleTyrTrpHisGlyArgAspAsnArgThrIleATGGCACTAGAGCAGAGAATAGAACATATATCTATTGGCATGGCAGAGATAATAGAACTA         .         .         .       7000         .     
    .  IleSerLeuAsnLysTyrTyrAsnLeuSerLeuHisCysLysArgProGlyAsnLysThrTCATCAGCTTAAACAAATATTATAATCTCAGTTTGCATTGTAAGAGGCCAGGGAATAAGA         .         .         .         .         .         .  ValLysGlnIleMetLeuMetSerGlyHisValPheHisSerHisTyrGlnProIleAsnCAGTGAAACAAATAATGCTTATGTCAGGACATGTGTTTCACTCCCACTACCAGCCGATCA         .       7100.         .         .         .         .  LysArgProArgGlnAlaTrpCysTrpPheLysGlyLysTrpLysAspAlaMetGlnGluATAAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAATGGAAAGACGCCATGCAGG         .         .         .         .         .       7200  ValLysGluThrLeuAlaLysHisProArgTyrArgGlyThrAsnAspThrArgAsnIleAGGTGAAGGAAACCCTTGCAAAACATCCCAGGTATAGAGGAACCAATGACACAAGGAATA         .         .         .         .         .         .  SerPheAlaAlaProGlyLysGlySerAspProGluValAlaTyrMetTrpThrAsnCysTTAGCTTTGCAGCGCCAGGAAAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAACT         .         .         .       7300         .         .  ArgGlyGluPheLeuTyrCysAsnMetTHrTrpPheLeuAsnTrpIleGluAsnLysThrGCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAATTGGATAGAGAATAAGA         .         .         .         .         .         .  HisArgAsnTyrAlaProCysHisIleLysGlnIleIleAsnThrTrpHisLysValGlyCACACCGCAATTATGCACCGTGCCATATAAAGCAAATAATTAACACATGGCATAAGGTAG         .       7400         .         .         .         .  ArgAsnValTyrLeuProProArgGluGlyGluLeuSerCysAsnSerThrValThrSerGGAGAAATGTATATTTGCCTCCCAGGGAACGGGAGCTGTCCTGCAACTCAACAGTAACCA         .         .         .         .         .       7500  IleIleAlaAsnIleAspTrpGlnAsnAsnAsnGlnThrAsnIleThrPheSerAlaGluGCATAATTGCTAACATTGACTGGCAAAACAATAATCAGACAAACATTACCTTTAGTGCAG         .         .         .         .         .         .  ValAlaGluLeuTyrArgLeuGluLeuGlyAspTyrLysLeuValGluIleThrProIleAGGTGGCAGAACTATACAGATTGGAGTTGGGAGATTATAAATTGGTAGAAATAACACCAA         .       7600         .         .         .         .  GlyPheAlaProThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArgGlyTTGGCTTCGCACCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGAG         .         .         .         .      
   .         .  ValPheValLeuGlyPheLeuGlyPheLeuAlaThrAlaGlySerAlaMetGlyAlaAlaGTGTGTTCGTGCTAGGGTTCTTGGGTTTTCTCGCAACAGCAGGTTCTGCAATGGGCGCGG         .       7700         .         .         .         .  SerLeuThrValSerAlaGlnSerArgThrLeuLeuAlaGlyIleValGlnGlnGlnGlnCGTCCCTGACCGTGTCGGCTCAGTCCCGGACTTTACTGGCCGGGATAGTGCAGCAACAGC         .         .         .         .         .       7800  GlnLeuLeuAspValValLysArgGlnGlnGluLeuLeuArgLeuThrValTrpGlyThrAACAGCTGTTGGACGTGGTCAAGAGACAACAAGAACTGTTGCGACTGACGGTCTGGGGAA         .         .         .         .         .         .  LysAsnLeuGlnAlaArgValThrAlaIleGluLysTyrLeuGlnAspGlnAlaArgLeuCGAAAAACCTCCAGGCAAGAGTCACTGCTATAGAGAAGTACCTACAGGACCAGGCGCGGC         .         .         .       7900         .         .  AsnSerTrpGlyCysAlaPheArgGlnValCysHisThrThrValProTrpValAsnAspTAAATTCATGGGGATGTGCGTTTAGACAAGTCTGCCACACTACTGTACCATGGGTTAATG         .         .         .         .         .         .  SerLeuAlaProAspTrpAspAsnMetThrTrpGlnGluTrpGluLysGlnValArgTyrATTCCTTAGCACCTGACTGGGACAATATGACGTGGCAGGAATGGGAAAAACAAGTCCGCT         .       8000         .         .         .         .  LeuGluAlaAsnIleSerLysSerTrpAspIlePheGlyAsnTrpPheAspLeuThrSerTGTATGAACTACAAAAATTAAATAGCTGGGATATTTTTGGCAATTGGTTTGACTTAACCT         .         .         .         .         .       8100  TyrGluLeuGlnLysLeuAsnSerTrpAspIlePheGlyAsnTrpPheAspLeuThrSerTGTATGAACTACAAAAATTAAATAGCTGGGATATTTTTGGCAATTGGTTTGACTTAACCT         .         .         .         .         .         .  TrpValLysTyrIleGlnTyrGlyValLeuIleIleValAlaValIleAlaLeuArgIleCCTGGGTCAAGTATATTCAATATGGAGTGCTTATAATAGTAGCAGTAATAGCTTTAAGAA         .         .         .       8200         .         .  ValIleTyrValValGlnMetLeuSerArgLeuArgLysGlyTyrArgProValPheSerTAGTGATATATGTAGTACAAATGTTAAGTAGGCTTAGAAAGGGCTATAGGCCTGTTTTCT         .         .         .         .         .         .                           
SerIleSerThrArgThrGlyAspSerGlnPro                         AsnProTyrProGlnGlyProGlyThrAlaSerGln  SerProProGlyTyrIleGlnGlnIleHisIleHisLysAspArgGlyGlnProAlaAsnCTTCCCCCCCCGGTTATATCCAACAGATCCATATCCACAAGGACCGGGGACAGCCAGCCA         .       8300         .         .         .         .ThrLysLysGlnLysLysThrValGluAlaThrValGluThrAspThrGlyProGlyArg ArgArgAsnArgArgArgArgTrpLysGlnArgTrpArgGlnIleLeuAlaLeuAlaAsp  GluGluThrGluGluAspGlyGlySerAsnGlyGlyAspArgTyrTrpProTrpProIle         .         .         .         .         .       8400 SerIleTyrThrPheProAspProProAlaAspSerProLeuAspGlnThrIleGlnHis  AlaTyrIleHisPheLeuIleArgGlnLeuIleArgLeuLeuThrArgLeuTyrSerIleTAGCATATATACATTTCCTGATCCGCCAGCTGATTCGCCTCTTGACCAGACTATACAGCA         .         .         .         .         .         . LeuGlnGlyLeuThrIleGlnGluLeuProAspProProThrHisLeuProGluSerGln  GysArgAspLeuLeuSerArgSerPheLeuThrLeuGlnLeuIleTyrGlnAsnLeuArgTCTGCAGGGACTTACTATCCAGGAGCTTCCTGACCCTCCAACTCATCTACCAGAATCTCA         .         .         .       8500         .         . ArgLeuAlaGluThr                     MetGlyAlaSerGlySerLysLys  AspTrpLeuArgLeuArgThrAlaPheLeuGlnTyrGlyCysGluTrpIleGlnGluAlaGAGACTGGCTGAGACTTAGAACAGCCTTCTTGGAATATGGGTGCGAGTGGATCCAAGAAG         .         .         .         .         .         .HisSerArgProProArgGlyLeuGlnGluARgLeuLeuArgLalArgAlaGlyAlaCys  PheGlnAlaAlaAlaArgAlaThrArgGluThrLeuAlaGlyAlaCysArgGlyLeuTrpCATTCCAGGCCGCCGCGAGGGCTACAAGAGAGACTCTTGCGGGCGCGTGCAGGGGCTTGT         .       8600         .         .         .         .GlyGlyTyrTrpAsnGluSerGlyGlyGluTyrSerArgPheGlnGluGlySerAspArg  ArgValLeuGluArgIleGlyArgGlyileLeuAlaValProArgARgIleArgGlnGlyGCAGGGTATTGGAACGAATCGGGAGGGGAATACTCGCGGTTCCAAGAAGGATCAGACAGG         .         .         .         .         .       8700GluGlnLysSerProSerCysGluGlyArgGlnTyrGlnGlnGlyAspPheMetAsnThr  AlaGluIleAlaLeuLeuGAGCAGAAATCGCCCTCCTGTGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAATACT         .         .         .         .         .         
.ProTrpLysAspProAlaAlaGluArgGluLysAsnLeuTyrArgGlnGlnAsnMetAspCCATGGAAGGACCCAGCAGCAGAAAGGGAGAAAAATTTGTACAGGCAACAAAATATGGAT         .         .         .       8800         .         .AspValAspSerAspAspAspAspGlnValArgValSerValThrProLysValProLeuGATGTAGATTCAGATGATGATGACCAAGTAAGAGTTTCTGTCACACCAAAAGTACCACTA         .         .         .         .         .         .ArgProMetThrHisArgLeuAlaIleAspMetSerHisLeuIleLysThrArgGlyGlyAGACCAATGACACATAGATTGGCAATAGATATGTCACATTTAATAAAAACAAGGGGGGGA         .       8900         .         .         .         .LeuGluGlyMetPheTyrSerGluArgArgHisLysIleLeuAsnIleTyrLeuGluLysCTGGAAGGGATGTTTTACAGTGAAAGAAGACATAAAATCTTAAATATATACTTAGAAAAG         .         .         .         .         .       9000GluGluGlyIleIleAlaAspTrpGlnAsnTyrThrHisGlyProGlyValArgTyrProGAAGAAGGGATAATTGCAGATTGGCAGAACTACACTCATGGGCCAGGAGTAAGATACCA         .         .         .         .         .         .MetPhePheGlyTrpLeuTrpLysLeuValProValAspValProGlnGluGlyGluAspATGTTCTTTGGGTGGCTATGGAAGCTAGTACCAGTAGATGTCCCACAAGAAGGGGAGGAC         .         .         .       9100         .         .ThrGluThrHisCysLeuValHisProAlaGlnThrSerLysPheAspAspProHisGlyACTGAGACTCACTGCTTAGTACATCCAGCACAAACAAGCAAGTTTGATGACCCGCATGGG         .         .         .         .         .         .GluThrLeuValTrpGluPheAspProLeuLeuAlaTyrSerTyrGluAlaPheIleArgGAGACACTAGTCTGGGAGTTTGATCCCTTGCTGGCTTATAGTTACGAGGCTTTTATTCGG         .       9200         .         .         .         .TyrProGluGluPheGlyHisLysSerGlyLeuProGluGluGluTrpLysAlaArgLeuTACCCAGAGGAATTTGGGCACAAGTCAGGCCTGCCAGAGGAAGAGTGGAAGGCGAGACTG         .         .         .         .         .       9300LysAlaArgGlyIleProPheSerAAAGCAAGAGGAATACCATTTAGTTAAAGACAGGAACAGCTATACTTGGTCAGGGCAGGA         .         .         .         .         .         .AGTAACTAACAGAAACAGCTGAGACTGCAGGGACTTTCCAGAAGGGGCTGTAACCAAGGG         .         .         .       9400         .         .AGGGACATGGGAGGAGCTGGTGGGGAACGCCCTCATATTCTCTGTATAAATATACCCGCT         .         .         .         .         
.         .AGCTTGCATTGTACTTCGGTCGCTCTGCGGAGAGGCTGGCAGATTGAGCCCTGGGAGGTT         .       9500         .         .         .         .CTCTCCAGCAGTAGCAGGTAGAGCCTGGGTGTTCCCTGCTAGACTCTCACCAGCACTTGG         .         .         .         .         .       9600CCGGTGCTGGGCAGACGGCCCCACGCTTGCTTGCTTAAAAACCTCCTTAATAAAGCTGCC         .         .         .         .         .         . AGTTAGAAGCA         .
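The three-letter amino acid lines shown above the nucleotide lines can be spot-checked against the standard genetic code. The sketch below is an illustration only, not the procedure used to prepare the listing; the function names are assumptions. It translates the first ten codons of the GAG reading frame, which begins at nucleotide 546 in the numbering of this example.

```python
# Standard genetic code, with codons ordered by base in the sequence T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_three_letter(dna):
    """Translate whole codons of `dna` into three-letter codes, stopping at a stop codon."""
    residues = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[pos:pos + 3].upper()]
        if aa == "*":  # stop codon ends the reading frame
            break
        residues.append(THREE_LETTER[aa])
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nucleotides of the GAG reading frame, copied from the listing.
    gag_start = "ATGGGCGCGAGAAACTCCGTCTTGAGAGGG"
    print(translate_three_letter(gag_start))  # MetGlyAlaArgAsnSerValLeuArgGly
```

The result agrees with the amino acid line printed above that stretch of the listing.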

Example 5 Sequences of the Coding Regions for the Envelope Protein and GAG Product of the ROD HIV-2 Isolate

[0084] Through experimental analysis of the HIV-2 ROD isolate, the following sequences were identified for the regions encoding the env and gag gene products. One of ordinary skill in the art will recognize that the numbering for both gene regions which follow begins for convenience with “1” rather than the corresponding number for its initial nucleotide as given in Example 4, above, in the context of the complete genomic sequence.
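The relation between the two numbering schemes is a fixed offset. The sketch below assumes the region start positions given in Example 4 (546 for GAG, 6147 for Env); the function and constant names are illustrative and not part of the original disclosure.

```python
# Start positions of the coding regions in the genomic numbering of Example 4.
GAG_START = 546
ENV_START = 6147


def genomic_to_local(genomic_pos, region_start):
    """Convert a genomic position to region numbering that begins at 1."""
    if genomic_pos < region_start:
        raise ValueError("position lies upstream of the region start")
    return genomic_pos - region_start + 1


def local_to_genomic(local_pos, region_start):
    """Convert a 1-based region position back to genomic numbering."""
    if local_pos < 1:
        raise ValueError("region numbering begins at 1")
    return region_start + local_pos - 1


if __name__ == "__main__":
    # The first env nucleotide is 6147 in Example 4 and 1 in this example.
    print(genomic_to_local(6147, ENV_START))
    # The last gag nucleotide (local position 1566) maps back to 2111.
    print(local_to_genomic(1566, GAG_START))
```

Under this convention, the env region runs from 1 to 2574 and the gag region from 1 to 1566 in local numbering.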

[0085] Envelope Sequence MetMetAsnGlnLeuLeuIleAlaIleLeuLeuAlaSerAlaCysATGATGAATCAGCTGCTTATTGCCATTTTATTAGCTAGTGCTTGC         .         .         .         .LeuValTyrCysThrGlnTyrValThrValPheTyrGlyValProTTAGTATATTGCACCCAATATGTAACTGTTTTCTATGGCGTACCC    .         .         .         .         .ThrTrpLysAsnAlaThrIleProLeuPheCysAlaThrArgAsnACGTGGAAAAATGCAACCATTCCCCTGTTTTGTGCAACCAGAAAT       100         .         .         .ArgAspThrTrpGlyThrIleGlnCysLeuProAspAsnAspAspAGGGATACTTGGGGAACCATACAGTGCTTGCCTGACAATGATGAT    .         .         .         .         .TyrGlnGluIleThrLeuAsnValThrGluAlaPheAspAlaTrpTATCAGGAAATAACTTTGAATGTAACAGAGGCTTTTGATGCATGG         .       200         .         .AsnAsnThrValThrGluGluAlaIleGluAspValTrpHisLeuAATAATACAGTAACAGAACAAGCAATAGAAGATGTCTGGCATCTA    .         .         .         .         .PheGluThrSerIleLysProCysValLysLeuThrProLeuCysTTCGAGACATCAATAAAACCATGTGTCAAACTAACACCTTTATGT         .         .       300         .ValAlaMetLysCysSerSerThrGluSerSerThrGlyAsnAsnGTAGCAATGAAATGCAGCAGCACAGAGAGCAGCACAGGGAACAAC    .         .         .         .         .ThrThrSerLysSerThrSerThrThrThrThrThrProThrAspACAACCTCAAAGAGCACAAGCACAACGACAACCACACCCAGAGAC         .         .         .       400GlnGluGlnGluIleSerGluAspThrProCysAlaArgAlaAspCAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGAC    .         .         .         .         .AsnCysSerGlyLeuGlyGluGluGluThrIleAsnCysGlnPheAACTGCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTC         .         .         .         .AsnMetThrGlyLeuGluARgAspLysLysLysGlnTyrAsnGluAATATGACAGGATTAGAAAGAGATAAGAAAAAAACAGTATAATGAA  500         .         .         .         .ThrTrpTyrSerLysAspValValCysGluThrAsnAsnSerThrACATGGTACTCAAAAGATGTGGTTTGTGAGACAAATAATAGCACA         .         .         .         .AsnGlnThrGlnCysTyrMetAsnHisCysAsnThrSerValIleAATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATC    .       600         .         .         .ThrGluSerCysAspLysHisTyrTrpAspAlaIleArgPheArgACAGAATCATGTGACAAGCACTATTGGGATGCTATAAGGTTTAGA         .         .         
.         .TyrCysAlaProProGlyTyrAlaLeuLeuArgCysAsnAspThrTACTGTGCACCACGGGTTATGCCCTATTAAGATGTAATGATACC    .        .       700         .         .AsnTyrSerGlyPheAlaProAsncysSerLysValValAlsSerAATTATTCAGGCTTTGCACCCAACTGTTCTAAAGTAGTAGCTTCT         .         .         .         .ThrCysThrArgMetMetGluThrGlnThrSerThrTrpPheGlyACATGCACCAGGATGATGGAAACGCAAACTTCCACATGGTTTGGC    .         .         .       800        .PheAsnGlyThrArgAlaGluAsnArgThrTyrIleTyrTrpHisTTTAATGGCACTAGAGCAGAGAATAGAACATATATCTATTGGCAT         .         .         .         .GlyArgAspAsnArgThrIleIleSerLeuAsnLysTyrTyrAsnGGCAGAGATAATAGAACTATCATCAGCTTAAACAAATATTATAAT    .         .         .         .       900LeuSerLeuHisCysLysArgProGlyAsnLysThrValLysGlnCTCAGTTTGCATTGTAAGAGGCCAGGGAATAAGACAGTGAAACAA         .         .         .         .IleMetLeuMetSerGlyHisValPheHisSerHisTyrGlnProATAATGCTTATGTCAGGACATGTGTTTCACTCCCACTACCAGCCG    .         .         .         .         .IleAsnLysArgProArgGlnAlaTrpCysTrpPheLysGlyLysATCAATAAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAA      1000         .         .         .TrpLysAspAlaMetGlnGluValLysThrLeuAlaLysHisProTGGAAAGACGCCATGCAGGAGGTGAAGACCCTTGCAAAACATCCC    .         .         .         .         .ArgTyrArgGlyThrAsnAspThrArgAsnIleSerPheAlaAlaAGGTATAGAGGAACCAATGACACAAGGAATATTAGCTTTGCAGCG         .      1100         .         .ProGlyLysGlySerAspProGluValAlaTyrMetTrpThrAsaCCAGGAAAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAAC    .         .         .         .         .CysArgGlyGluPheLeuTyrCysAsnMetThrTrpPheLeuAsnTGCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAAT        .         .      1200         .TrpIleGluAsnLysThrHisArgAsnTyrAlaProcysHisIleTGGATAGAGAATAAGACACACCGCAATTATGCACCGTGCCATATA    .         .         .         .         .LysGlnIleIleAsnThrTrpHisLysValGlyArgAsnValTyrAAGCAAATAATTAACACATGGCATAAGGTAGGGAGAAATGTATAT         .         .         .      1300LeuProProArgGluGlyGluLeuSerCysAsnSerThrValThrTTGCCTCCCAGGGAACGGGAGCTGTCCTGCAACTCAACAGTAACC    .         .         .         .         
.SerIleIleAlaAsnIleAspTrpGlnAsnAsnAsnGlnThrAsnAGCATAATTGCTAACATTGACTGGCAAAACAATAATCAGACAAC         .         .         .         .IleThrPheSerAlaGluValAlaGluLeuTyrArgLeuGluLeuATTACCTTTAGTGCAGAGGTGGCAGAACTATACAGATTGGAGTTG 1400         .         .         .         .GlyAspTyrLysLeuValGluIleThrProIleGlyPheAlaProGCAGATTATAAATTGGTAGAAATAACACCAATTGGCTTCGCACCTThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArgACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGA    .      1500         .         .         .GlyValPheValLeuGlyPheLeuGlyPheLeuAlaThrAlaGlyGGTGTGTTCGTGCTAGGGTTCTTGGGTTTTCTCGCAACAGCAGGT         .         .         .         .SerAlaMetGlyAlaArgAlaSerLeuThrValSerAlaGluSerTCTGCAATGGGCGCTCGAGCGTCCCTGACCGTGTCGGCTCAGTCC    .         .      1600         .         .ArgThrLeuLeuAlaGlyIleValGlnGlnGluGlnGluLeuLeuCGGACTTTACTGGCCGGGATAGTGCAGCAACAGCAACAGCTGTTG         .         .         .         .AspValValLysArgGlnGlnGluLeuLeuArgLeuThrValTrpGACGTGGTCAAGAGACAACAAGAACTGTTGCGACTGACCGTCTGG    .         .         .      1700         .GlyThrLysAsnLeuGlnAlaArgValThrAlaIleGluLysTyrGGAACGAAAAACCTCCAGGCAAGAGTCACTGCTATAGAGAAGTAC         .         .         .         .LeuGlnAspGlnAlaArgLeuAsnSerTrpGlyCysAlaPheArgCTACAGGACCAGGCGCGGCTAAATTCATGGGGATGTGCGTTTAGA    .         .         .         .      1800GlnValCysHisThrThrValProTrpValAsnAspSerLeuAlaCAAGTCTGCCACACTACTGTACCATGGGTTAATGATTCCTTAGCA         .         .         .         .ProAspTrpAspAsnMetThrTrpGlnGluTrpGluLysGluValCCTGACTGGGACAATATGACGTGGCAGGAATGGGAAAAACAAGTC    .         .         .         .         .ArgTyrLeuGluAlaAsnIleSerLysSerLeuGluGlnAlaGlnCGCTACCTGGAGGCAAATATCAGTAAAAGTTTAGAACAGGCACAA      1900         .         .         .IleGlnGlnGluLysAsnMetTyrGluLeuGlnLysLeuAsnSerATTCAGCAAGAGAAAAATATGTATGAACTACAAAAATTAAATAGC    .         .         .         .         .TrpAspIlePheGlyAsnTrpPheAspLeuThrSerTrpValLysTGGGATATTTTTGGCAATTGGTTTGACTTAACCTCCTGGGTCAAG         .      2000         .         
.TyrIleGlnTyrGlyValLeuIleIleValAlaValIleAlaLeuTATATTCAATATGGAGTGCTTATAATAGTAGCAGTAATAGCTTTA    .         .         .         .         .ArgIleValIleTyrValValGlnMetLeuSerArgLeuArgLysAGAATAGTGATATATGTAGTACAAATGTTAAGTAGGCTTAGAAAG         .         .      2100         .GlyTyrArgProValPheSerSerProProGlyTyrIleGln***GGCTATAGGCCTGTTTTCTCTTCCCCCCCCGGTTATATCCAATAGIleHisIleHisLysAspArgGlyGluProAlaAsnGluGluThrATCCATATCCACAAGGACCGGGGACAGCCAGCCAACGAAGAAACA         .         .         .      2200GluGluAspGlyGlySerAsnGlyGlyAspArgTyrTrpProTrpGAAGAAGACGGTGGAAGCAACGGTGGAGACAGATACTGGCCCTGG    .         .         .         .         .ProIleAlaTyrIleHisPheLeuIleArgGlnLeuIleArgLeuGCGATAGCATATATACATTTCCTGATCCGCCAGCTGATTCGCCTC         .         .         .         .LeuThrArgLeuTyrSerIleCysArgAspLeuLeuSerArgSerTTGACCAGACTATACAGCATCTGCAGGGACTTACTATCCAGGAGC 2300         .         .         .         .PheLeuThrLeuGluLeuIleTyrGlnAsnLeuArgAspTrpLeuTTCCTGACCCTCCAACTCATCTACCAGAATCTCAGAGACTGGCTG         .         .         .         .ArgLeuArgThrAlaPheLeuGlnTyrGlyCysGluTrpIleGlnAGACTTAGAACAGCCTTCTTGCAATATGGGTGCGAGTGGATCCAA    .      2400         .         .         .GluAlaPheGlnAlaAlaAlaArgAlaThrArgGluThrLeuAlaGAAGCATTCCAGGCCGCCGCGAGGGCTACAAGAGAGACTCTTGCG         .         .         .         .GlyAlaCysArgGlyLeuTrpArgValLeuGluArgIleGlyArgGGCGCGTGCAGGGGCTTGTGGAGGGTATTGGAACGAATCGGGACG         .         .      2500         .         .GlyIleLeuAlaValProArgARgIleArgGlnGlyAlaGluIleGGAATACTCGCGGTTCCAAGAAGGATCAGACAGGGAGCAGAAATC         .         .         .         .AlaLeuLeu***GlyThrAlaValSerAlaGlyArgLeuTyrGluGCCCTCCTGTGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAA    .         .         .      2600         .TyrSerMetGluGlyProSerSerArgLysGlyGluLysPheValTACTCCATGGAAGGACCCAGCAGCAGAAAGGGAGAAAAATTTGTA         .         .         .         . GlnAlaThrLysTyrGlyCAGGCAACAAAATATGGA     .         .MetGlyAlaArgAsnSerValLeuArgGlyLysLysAlaAspGluATGGGCGCGAGAAACTCCGTCTTGAGAGGGAAAAAAGCAGATGAA         .         .         .         
.LeuGluArgIleArgLeuArgProGlyGlyLysLysLysTyrArgTTAGAAAGAATCAGGTTACGGCCCGGCGGAAAGAAAAAGTACAGG    .         .         .         .         .LeuLysHisIleValTrpAlaAlaAsnLysLeuAspArgPheGlyCTAAAACATATTGTGTGGGCAGCGAATAAATTGCACAGATTCGGA       100         .         .         .LeuAlaGluSerLeuLeuGluSerLysGluGlyCysGlnLysIleTTAGCAGAGAGCCTGTTGGAGTCAAAAGAGGGTTGTCAAAAAATT    .         .         .         .LeuThrValLeuAspProMetValProThrGlySerGluAsnLeuCTTACAGTTTTAGATCCAATGGTACCGACAGGTTCAGAAAATTTA         .       200         .         .LysSerLeuPheAsnThrValCysValIleTrpCysIleHisAlaAAAAGTCTTTTTAATACTGTCTGCGTCATTTGGTGCATACAGGCA    .         .         .         .         .GluGluLysValLysAspThrGluGlyAlaLysGlnIleValArgGAAGAGAAAGTGAAAGATACTGAAGGAGCAAAACAAATAGTGCGG         .         .       300         .ArgHisLeuValAlaGluThrGlyThrAlaGluLysMetProSerAGACATCTAGTGGCAGAAACAGGAACTGCAGAGAAAATGCCAAGG    .         .         .         .ThrSerArgProThrAlaProSerSerGluLysGlyGlyAsnTyrACAAGTAGACCAACAGCACCATCTAGCGAGAAGGGAGGAAATTAC        .         .         .       400ProValGlnHisValGlyGlyAsnTyrThrHisIleProLeuSerCCAGTGCAACATGTAGGCGGCAACTACACCCATATACCGCTGACT    .         .         .         .         .ProArgThrLeuAsnAlaTrpValLysLeuValGluGluLysLysCCCCGAACCCTAAATGCCTGGGTAAAATTAGTAGAGGAAAAAAAG         .         .         .         .PheGlyAlaGluValValProGlyPheGlnAlaLeuSerGluGlyTTCGGGGCAGAAGTAGTGCCAGGATTTCAGGCACTCTCAGAAGGC  500         .         .         .         .CysThrProTyrAspIleAsnGlnMetLeuAsnCysValGlyAspTGCACGCCCTATGATATCAACCAAATGCTTAATTGTGTGGGCGAC         .         .         .         .HisGlnAlaAlaMetGlnIleIleArgGluIleIleAsnGluGluCATCAAGCAGCCATGCAGATAATCAGGGAGATTATCAATGAGGAA    .       600         .         .         .AlaAlaGluTrpAspValGlnHisProIleProGlyProLeuProGCAGCAGAATGGGATGTGCAACATCCAATACCAGGCCCCTTACCA         .         .         .         .AlaGlyGlnLeuArgGluProArgGlySerAspIleAlaGlyThrGCGGGGCAGCTTAGAGAGCCAAGGGGATCTGACATAGCAGGGACA    .         .       700         .         
.ThrSerThrValGluGluGlnIleGlnTrpMetPheArgProGlnACAAGCACAGTAGAAGAACAGATCCAGTGGATGTTTAGGCCAGAAAsnProValProValGlyAsnIleTyrArgArgTrpIleGlnIleAATCCTGTACCAGTAGGAAACATCTATAGAAGATGGATCCAGATA    .         .         .       800         .GlyLeuGlnLysCysValARgMetTyrAsnProThrAsnIleLeuGGATTGCAGAAGTGTGTCAGGATGTACAACCCGACCAACATCCTA         .         .         .         .         .AspIleLysGlnGlyProLysGluProPheGlnSerTyrValAspGACATAAAACAGGGACCAAAGGAGCCGTTCCAAAGCTATGTAGAT    .         .         .         .      900ArgPheTyrLysSerLeuArgAlaGluGlnThrAspProAlaValAGATTCTACAAAAGCTTGAGGGCAGAACAAACAGATCCAGCAGTG         .         .         .         .LysAsnTrpMetThrGlnThrLeuLeuValGlnAsnAlaAsnProAAGAATTGGATGACCCAAACACTGCTAGTACAAAATGCCAACGCA    .         .         .         .         .AspCysLysLeuValLeuLysGlyLeuGlyMetAsnProThrLeuGACTGTAAATTAGTGCTAAAAGGACTAGGGATGAACCCTACCTTA      1000         .         .         .GluGluMetLeuThrAlacysGlnGlyValGlyGlyProGlyGlnGAAGAGATGCTGACCGCCTGTCAGGGGGTAGGTGGGCCAGGCCAG    .         .         .         .         .LysAlaArgLeuMetAlaGluAlaLeuLysGluValIleGlyProAAAGCTAGATTAATGGCAGAGGCCCTGAAAGAGGTCATAGGACCT         .      1100         .         .AlaProIleProPheALaAlaAlaGlnGlnArgLysAlaPheLysGCCCCTATCCCATTCGCAGCAGCCCAGGAGAGAAAGGCATTTAAA    .         .         .         .         .CysTrpAsnCysGlyLysGluGlyHisSerAlaArgGlnCysArgTGCTGGAACTGTGGAAAGGAAGGGCACTCGGCAAGACAATGCCGA         .         .       1200         .AlaProArgARgGlnGlyCysTrpLysCysGlyLysProGlyHisGCACCTAGAAGGCAGGGCTGCTGGAAGTGTGGTAAGCCAGGACAC    .         .         .         .         .IleMetThrAsnCysProAspArgGlnAlaGlyPheLeuGlyLeuATCATGACAAACTGCCCAGATAGACAGGCAGGTTTTTTAGGACTG         .         .         .       1300GlyProTrpGlyLysLysProArgAsnPheProValAlaGlnValGGCCCTTGGGGAAAGAAGCCCCGCAACTTCCCCGTGGCCCAAGTT    .         .         .         .         .ProGlnGlyLeuThrProThrAlaProProValAspProAlaValCCGCAGGGGCTGACACCAACAGCACCCCCAGTGGATCCAGCAGTG         .         .         .         
.AspLeuLeuGluLysTyrMetGlnGlnGlyLysArgGlnArgGluGATCTACTGGAGAAATATATGCAGCAAGGGAAAAGACAGAGAGAG 1400         .         .         .         .GlnArgGluArgProTyrLysGluValThrGluAspLeuLeuHisCAGAGAGAGAGACCATACAAGGAAGTGACAGAGGACTTACTGCAC         .         .         .         .LeuGluGlnGlyGluThrProTyrArgGluProProThrGluAspCTCGAGCAGGGGGAGACACCATACAGGGAGCCACCAACAGAGGAC    .       1500         .         .         .LeuLeuHisLeuAsnSerLeuPheGlyLysAspGlnTTGCTGCACCTCAATTCTCTCTTTGGAAAAGACCAG          .         .         .

Example 6 Peptide Sequences Encoded by the ENV and GAG Genes

[0086] The following coding regions for antigenic peptides, identified for convenience only by the nucleotide numbers of Example 5, within the env and gag gene regions are of particular interest. Each entry gives the amino acid sequence above the corresponding nucleotide sequence.

env1 (1732-1809)
ArgValThrAlaIleGluLysTyr
AGAGTCACTGCTATAGAGAAGTAC
LeuGlnAspGlnAlaArgLeuAsnSerTrpGlyCysAlaPheArg
CTACAGGACCAGGCGCGGCTAAATTCATGGGGATGTGCGTTTAGA
GlnValCys
CAAGTCTGC

env2 (1912-1983)
SerLysSerLeuGluGlnAlaGln
AGTAAAAGTTTAGAACAGGCACAA
IleGlnGlnGluLysAsnMetTyrGluLeuGlnLysLeuAsnSer
ATTCAGCAAGAGAAAAATATGTATGAACTACAAAAATTAAATAGC
Trp
TGG

env3 (1482-1530)
ProThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArg
CCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGA

env4 (55-129)
CysThrGlnTyrValThrValPheTyrGlyValPro
TGCACCCAATATGTAACTGTTTTCTATGGCGTACCC
ThrTrpLysAsnAlaThrIleProLeuPheCysAlaThr
ACGTGGAAAAATGCAACCATTCCCCTCTTTTGTGCAACC

env5 (175-231)
AspAsp
GATGAT
TyrGlnGluIleThrLeuAsnValThrGluAlaPheAspAlaTrp
TATCAGGAAATAACTTTGAATGTAACAGAGGCTTTTGATGCATGG
AsnAsn
AATAAT

env6 (274-330)
GluThrSerIleLysProCysValLysLeuThrProLeuCys
GAGACATCAATAAAACCATGTGTGAAACTAACACCTTTATGT
ValAlaMetLysCys
GTAGCAATGAAATGC

env7 (607-660)
AsnHisCysAsnThrSerValIle
AACCATTGCAACACATCAGTCATC
ThrGluSerCysAspLysHisTyrTrpAsp
ACAGAATCATGTGACAAGCACTATTGGGAT

env8 (661-720)
AlaIleArgPheArg
GCTATAAGGTTTAGA
TyrCysAlaProProGlyTyrAlaLeuLeuArgCysAsnAspThr
TACTGTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATACC

env9 (997-1044)
LysArgProArgGlnAlaTrpCysTrpPheLysGlyLys
AAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAA
TrpLysAsp
TGGAAAGAC

env10 (1132-1215)
LysGlySerAspProGluValAlaTyrMetTrpThrAsn
AAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAAC
CysArgGlyGluPheLeuTyrCysAsnMetThrTrpPheLeuAsn
TGCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAAT

env11 (1237-1305)
ArgAsnTyrAlaProCysHisIle
CGCAATTATGCACCGTGCCATATA
LysGlnIleIleAsnThrTrpHisLysValGlyArgAsnValTyr
AAGCAAATAATTAACACATGGCATAAGGTAGGGAGAAATGTATAT

gag1 (991-1053)
AspCysLysLeuValLeuLysGlyLeuGlyMetAsnProThrLeu
GACTGTAAATTAGTGCTAAAAGGACTAGGGATGAACCCTACCTTA
GluGluMetLeuThrAla
GAAGAGATGCTGACCGCC

[0087] Of the foregoing peptides, env1, env2, env3 and gag1 are particularly contemplated for diagnostic purposes, and env4, env5, env6, env7, env8, env9, env10 and env11 are particularly contemplated as protecting agents. These peptides have been selected in part because of their sequence homology to certain of the envelope and gag protein products of other retroviruses in the HIV group. For vaccinating purposes, the foregoing peptides may be coupled to a carrier protein by utilizing suitable and well-known techniques to enhance the host's immune response. Adjuvants such as calcium phosphate or alum hydroxide may also be added. The foregoing peptides can be synthesized by conventional protein synthesis techniques, such as that of Merrifield.
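
By way of illustration only, the nucleotide spans given in Example 6 can be related to peptide lengths with a short script. The sketch below assumes that each span is 1-based and inclusive, an assumption that agrees with the 21 residues listed for gag1 (991-1053) and the 26 residues listed for env1 (1732-1809). The function names are hypothetical and are not part of the disclosure.

```python
def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive span."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in a span; rejects spans that are not a multiple of 3."""
    n = span_length(start, end)
    if n % 3 != 0:
        raise ValueError(f"span of {n} nt is not a whole number of codons")
    return n // 3


def extract(sequence: str, start: int, end: int) -> str:
    """Return the subsequence for a 1-based, inclusive span."""
    if end > len(sequence):
        raise ValueError("span runs past the end of the sequence")
    span_length(start, end)  # validates start and end
    return sequence[start - 1:end]


if __name__ == "__main__":
    print(codon_count(991, 1053))    # gag1 span: 21 codons
    print(codon_count(1732, 1809))   # env1 span: 26 codons
    print(extract("AAACCCGGGTTT", 4, 9))  # toy sequence, not viral data
```

The toy sequence in the last line is a placeholder; in practice the full env or gag sequence of Example 5 would be supplied.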

[0088] It will be apparent to those skilled in the art that various modifications and variations can be made in the processes and products of the present invention. Thus, it is intended that the present application cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents. For convenience in interpreting the following claims, the following tables set forth the correspondence between codon codes and amino acids and the correspondence between three-letter and one-letter amino acid symbols. In the first table, 1, 2 and 3 denote the first, second and third bases of the codon.

               DNA CODON       AMINO ACID 3 LET. AMINO ACID 1 LET.
 ---------------------------------------------------------------
 :   :\2 :  T   C   A   G  :  T   C   A   G  :  T   C   A   G  :
 : 1 : 3\:                 :                 :                 :
 ---------------------------------------------------------------
 :   : T : TTT TCT TAT TGT : PHE SER TYR CYS :  F   S   Y   C  :
 : T : C : TTC TCC TAC TGC : PHE SER TYR CYS :  F   S   Y   C  :
 :   : A : TTA TCA TAA TGA : LEU SER *** *** :  L   S   *   *  :
 :   : G : TTG TCG TAG TGG : LEU SER *** TRP :  L   S   *   W  :
 ---------------------------------------------------------------
 :   : T : CTT CCT CAT CGT : LEU PRO HIS ARG :  L   P   H   R  :
 : C : C : CTC CCC CAC CGC : LEU PRO HIS ARG :  L   P   H   R  :
 :   : A : CTA CCA CAA CGA : LEU PRO GLN ARG :  L   P   Q   R  :
 :   : G : CTG CCG CAG CGG : LEU PRO GLN ARG :  L   P   Q   R  :
 ---------------------------------------------------------------
 :   : T : ATT ACT AAT AGT : ILE THR ASN SER :  I   T   N   S  :
 : A : C : ATC ACC AAC AGC : ILE THR ASN SER :  I   T   N   S  :
 :   : A : ATA ACA AAA AGA : ILE THR LYS ARG :  I   T   K   R  :
 :   : G : ATG ACG AAG AGG : MET THR LYS ARG :  M   T   K   R  :
 ---------------------------------------------------------------
 :   : T : GTT GCT GAT GGT : VAL ALA ASP GLY :  V   A   D   G  :
 : G : C : GTC GCC GAC GGC : VAL ALA ASP GLY :  V   A   D   G  :
 :   : A : GTA GCA GAA GGA : VAL ALA GLU GLY :  V   A   E   G  :
 :   : G : GTG GCG GAG GGG : VAL ALA GLU GLY :  V   A   E   G  :
 ---------------------------------------------------------------

 3 Letter  1 Letter  CODONS
 ALA       A         GCT GCC GCA GCG
 ARG       R         CGT CGC CGA CGG AGA AGG
 ASN       N         AAT AAC
 ASP       D         GAT GAC
 CYS       C         TGT TGC
 GLN       Q         CAA CAG
 GLU       E         GAA GAG
 GLY       G         GGT GGC GGA GGG
 HIS       H         CAT CAC
 ILE       I         ATT ATC ATA
 LEU       L         CTT CTC CTA CTG TTA TTG
 LYS       K         AAA AAG
 MET       M         ATG
 PHE       F         TTT TTC
 PRO       P         CCT CCC CCA CCG
 SER       S         TCT TCC TCA TCG AGT AGC
 THR       T         ACT ACC ACA ACG
 TRP       W         TGG
 TYR       Y         TAT TAC
 VAL       V         GTT GTC GTA GTG
 ***       *         TAA TAG TGA
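
As an illustrative aid only, the codon correspondence set out above can be applied mechanically. The following sketch builds the standard codon table in the same T, C, A, G ordering and translates the env3 coding region of Example 6 into one-letter and three-letter symbols. The function and variable names are hypothetical, and the script is a convenience for reading the tables, not part of the claimed subject matter.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for codons enumerated as first base, second base, third base,
# each running over T, C, A, G; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid symbols."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def to_three_letter(peptide: str) -> str:
    """Convert one-letter symbols to concatenated three-letter symbols."""
    return "".join(ONE_TO_THREE[residue] for residue in peptide)


if __name__ == "__main__":
    env3 = "CCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGA"
    peptide = translate(env3)
    print(peptide)
    print(to_three_letter(peptide))
```

Building the table from the ordered string rather than typing 64 entries by hand keeps the script short and avoids transcription slips in individual codons.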

What is claimed is:
1. A method for diagnosing an HIV-2 infection which comprises: (a) contacting genetic DNA or RNA from a body sample obtained from a person suspected of having an HIV-2 infection with a DNA probe derived from at least a portion of the genome of the HIV-2 virus; and (b) determining whether a hybridized complex is created.
2. The method of claim 1 wherein said body sample is selected from the group consisting of tissue, blood cells, cells and body fluids.
3. The method of claim 1 wherein the presence of the hybridized complex is determined by a process selected from the group consisting of Southern blot, Northern blot and dot blot.
4. The method of claim 1 wherein the cDNA probe is analogous to the entire genome of the HIV-2 virus.
5. A DNA probe capable of hybridizing to the entire genome of the HIV-2 virus.
6. A method for diagnosing an HIV-2 infection which comprises: (a) contacting sera obtained from a patient suspected of having an HIV-2 infection with a polypeptide expression product of a DNA segment derived from the genome of the HIV-2 virus; and (b) determining whether an immunocomplex is formed.
7. The method of claim 6 wherein the formation of the immunocomplex is determined by a process selected from the group consisting of radioimmunoassays (RIA), radioimmunoprecipitation assays (RIPA), immunofluorescence assays (IFA), enzyme-linked immunosorbent assays (ELISA) and Western blots.
8. A process for detecting the presence of a virus selected from the group consisting of LAV-II, HIV-2, STLV-III and other viruses which form complexes with LAV-II reagents comprising: (a) contacting DNA or RNA from a sample suspected of containing viral genetic material with a DNA probe derived from a portion of the genome of the HIV-2 virus; and (b) determining whether a hybridized complex is created.
9. A peptide selected from the group consisting of env1, env2, env3, env4, env5, env6, env7, env8, env9, env10, env11 and gag1.
10. A kit for diagnosing an HIV-2 infection by the method of claim 6 and comprising env1, env2, env3 and gag1 peptides as the polypeptide expression product.
11. A vaccinating agent comprising at least one peptide selected from the group consisting of env4, env5, env6, env7, env8, env9, env10 and env11 in admixture with suitable carriers.
12. A peptide having common immunological properties with the peptide structure of the envelope glycoprotein of a virus of the HIV-2 class, said peptide having no more than 40 amino acid residues.
13. A peptide according to claim 12 having either of the following formulas:

XR--A-E-D-YL-DQ--L-WGC-----CZ    XA-E-D-YL-DZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of either of the following peptide sequences:

RVTAIEKYLQDQARLNSWGCAFRQVC    AIEKYLQDQ


14. A peptide according to claim 12 having either of the following formulas:

X--E--Q-QQEKN--EL--L---Z      XQ-QQEKNZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of either of the following peptide sequences:

SLEQAQIQQEKNMYELQKLNSW      QIQQEKN


15. A peptide according to claim 12 characterized as having either of the following formulas:

XEL--YK-V-I-P-G-APTK-KR-----Z    XYK-V-I-P-G-APTK-KRZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of either of the following peptide sequences:

ELGDYKLVEITPIGFAPTKEKRYSSAH    YKLVEITPIGFAPTKEK


16. A peptide according to claim 12 characterized as having either of the following formulas:

X----VTV-YGVP-WK-AT--LPCA-Z     XVTV-YGVP-WK-ATZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

CTQYVTVFYGVPTWKNATIPLFCAT     VTVFYGVPTWKNAT
EKLWVTVYYGVPVWKEATTTLFCAS     VTVYYGVPVWKEAT


17. A peptide according to claim 16 characterized as having one of the following formulas:

CTQYVTVFYGVPTWKNATIPLFCAT     VTVFYGVPTWKNAT
EKLWVTVYYGVPVWKEATTTLFCAS     VTVYYGVPVWKEAT
EDLWVTVYYGVPVWKEATTTLFCAS     VTVYYGVPVWKEAT
DNLWVTVYYGVPVWKEATTTLFCAS     VTVYYGVPVWKEAT


18. A peptide according to claim 12 characterized as having either of the following formulas:

X---QE--L-NVTE-F--W-NZ         XL-NVTE-FZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

DDYQEITL-NVTEAFDAWNN        L-NVTE
PNPQEVVLVNVTENFNMWKN        LVNVTE


19. A peptide according to claim 18 characterized as having one of the following formulas:

DDYQEITL-NVTEAFDAWNN        L-NVTEAF
PNPQEVVLVNVTENFNMWKN        LVNVTENF
PNPQEIELENVTEGFNMWKN        LENVTEGF
PNPQEIALENVTENFNMWKN        LENVTENF


20. A peptide according to claim 12 characterized as having one of the following formulas:

XL---S-KPCVKLTPLCV--KZ       XKPCVKLTPLCVZ    XS-KPCVKLTPLCVZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

ETSIKPCVKLTPLCVAMK
DQSLKPCVKLTPLCVSLK    KPCVKLTPLCV   SLKPCVKLTPLCV


21. A peptide according to claim 20 characterized as having one of the following formulas:

ETSIKPCVKLTPLCVAMK
DQSLKPCVKLTPLCVSLK
DQSLKPCVKLTPLCVTLN      PCVKLTPLC


22. A peptide characterized as having either of the following formulas:

X---N-S-IT--C-Z    XN-S-ITZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

NHCNTSVITESCD    NTSVIT
TSCNTSVITQACP    NTSAIT


23. A peptide according to claim 22 characterized as having one of the following formulas:

NHCNTSVITESCD    NTSVIT
TSCNTSVITQACP    NTSVIT
INCNTSVITQACP    NTSVIT
INCNTSAITQACP    NTSAIT


24. A peptide according to claim 12 characterized as having the following formula:

XYC-P-G-A-L-C-N-TZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of either of the following peptide sequences:

YCAPPGYALLRC-NDT
YCAPAGFAILKCNNKT


25. A peptide according to claim 24 characterized as having one of the following formulas:

YCAPPGYALLRC-NDT
YCAPAGFAILKCNNKT
YCAPAGFAILKCNDKK
YCAPAGFAILKCRDKK


26. A peptide according to claim 12 characterized as having the following formula:

X------A-C------W--Z

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of either of the following peptide sequences:

NKRPRQAWCWFKG-KWKD
N--MRQAHCNISRAKWNA


27. A peptide according to claim 26 characterized as having one of the following formulas:

NKRPRQAWCWFKG-KWKT
N--MRQAHCNISRAKWNA
D--IRRAYCTINETEWDK
I--IGQAHCNISRAQWSK


28. A peptide according to claim 12 characterized as having either of the following formulas:

X-G-DPE------NC-GEF-YCN-----NZ            XNC-GEF-YCNZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

KGSDPEVAYMWTNCRGEFLYCNMTWFLN            NCRGEFLYCN
-GGDPEIVTHSFNCGGEFFYCNSTQLFN            NCGGEFFYCN


29. A peptide according to claim 28 characterized as having one of the following formulas:

KGSDPEVAYMWTNCRGEFLYCNMTWFLN             NCRGEFLYCN
-GGDPEIVTHSFNCGGEFFYCNSTQLFN             NCGGEFFYCN
-GGDPEITTHSFNCRGEFFYCNTSKLFN             NCRGEFFYCN
-GGDPEITTHSFNCGGEFFYCNTSGLFN             NCGGEFFYCN


30. A peptide according to claim 12 characterized as having either of the following formulas:

X-----C-IKQ-I------G---YZ      XC-IKQ-IZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of one of the following peptide sequences:

RNYAPCHIKQIINTWHKVGRNVY      CHIKQII
TITLPCRIKQFINMWQEVGKAMY      CRIKQFI


31. A peptide according to claim 30 characterized as having one of the following formulas:

RNYAPCHIKQIINTWHKVGRNVY      CHIKQII
TITLPCRIKQFINMWQEVGKAMY      CRIKQFI
SITLPCRIKQIINMWQKTCKAMY      CRIKQII
NITLQCRIKQIIKMVAGR-KAIY      CRIKQII


32. The antigenic peptide gag1 characterized as having the following formula:

XNCKLVLKGLGMNPTLEEMLTAZ

in which X and Z are OH or NH₂ or, to the extent that the immunological properties of the natural peptides lacking these groups shall not be essentially modified, the groups having from one to five amino acid residues, and each of the hyphens corresponding to an aminoacyl residue chosen from those which permit the conservation for the peptide characterized above of the immunological properties of the following peptide sequence:

XNCKLVLKGLGMNPTLEEMLTA

33. An antigenic composition containing at least one gag1 peptide according to claim 32 or at least an oligomer of this peptide, characterized as having the capacity to be recognized by human biological fluids such as serum containing anti-HIV-2 antibodies and under appropriate conditions anti-HIV-1 antibodies.
34. An antigenic composition containing at least one peptide according to claims 13, 14 or 15, or at least an oligomer of the peptide, characterized in that the peptide specifically recognizes the presence of anti-HIV-2 antibodies.
35. An immunogenic composition containing at least one peptide according to any one of the claims 16-31 or at least an oligomer of the peptide or the peptide conjugated with a carrier molecule, in association with an acceptable pharmaceutical vehicle for the production of vaccines, the composition characterized in that it induces antibody production against the peptide in sufficient quantities to form an effective immunocomplex with the entire HIV-2 retrovirus and its corresponding proteins.
36. An immunogenic composition according to claim 35 further comprising peptides having formulas corresponding to the envelope glycoprotein sequences of HIV-1 and HIV-2 which have an amino acid homology greater than 50%.
37. An immunogenic composition according to either of claims 35 or 36 having at least one peptide or at least an oligomer of the peptide or the peptide conjugated with a carrier molecule, the composition corresponding to a peptide chosen from the group consisting of Env4, Env5, Env6 and Env10.
38. A procedure for the in vitro diagnosis of HIV-2 infections in a biological fluid, comprising: contacting the biological fluid with at least one peptide according to claims 12, 13, 14, 15 or 32, or a conjugate of the peptide with a carrier molecule; detecting the eventual presence in the biological fluid of an antigen-antibody complex by physical or chemical methods.
39. The diagnostic procedure of claim 38, wherein the detection step is performed by a test selected from the group consisting of enzyme-linked immunosorbent assay (ELISA), immunofluorescence assay (IFA), radioimmunoassay (RIA), and radioimmunoprecipitation assay (RIPA).
40. A kit for the in vitro diagnosis of an HIV-2 infection in a biological fluid comprising: a peptide composition containing a peptide according to claims 12, 13, 14, 15 or 32, or a mixture of such peptides, or a conjugate of such peptides with a carrier molecule; an appropriate reaction environment for the production of an antigen-antibody complex; one or more reagents adapted for the detection of the formation of antigen-antibody complexes; and a biological fluid as a reference sample having no antibodies recognized by said peptide composition.
41. A protein selected from the group described in Example 4 consisting of p 16, p 26, p 12, polymerase, Q protein, R protein, X protein, Y protein, env protein, F protein, TAT, ART, U5 and U3.
42. A kit for diagnosing an HIV-2 infection by the method of claim 6 and comprising as the polypeptide expression product a protein of claim 41.
43. A vaccinating agent comprising at least one protein of claim 41 in association with appropriate carriers.
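
For readers interpreting the formula notation of the peptide claims, the following sketch illustrates one possible reading, in which X and Z are terminal groups, each letter is a fixed residue, and each hyphen stands for any single residue. It checks only the fixed positions of a formula against a candidate sequence. It does not and cannot assess the conservation of immunological properties that the claims require, it does not model terminal extensions of one to five residues, and it treats a hyphen inside a candidate sequence as an ordinary character. The function names are hypothetical.

```python
def core(formula: str) -> str:
    """Strip the terminal X and Z group markers from a claim formula, if present."""
    if len(formula) >= 2 and formula.startswith("X") and formula.endswith("Z"):
        return formula[1:-1]
    return formula


def fits(formula: str, peptide: str) -> bool:
    """True if the peptide has the formula's length and agrees at every fixed position."""
    pattern = core(formula)
    if len(pattern) != len(peptide):
        return False
    return all(p == "-" or p == r for p, r in zip(pattern, peptide))


if __name__ == "__main__":
    # Formulas and sequences taken from claim 14.
    print(fits("XQ-QQEKNZ", "QIQQEKN"))
    print(fits("X--E--Q-QQEKN--EL--L---Z", "SLEQAQIQQEKNMYELQKLNSW"))
```

A length check is applied before the position-by-position comparison because the formulas are written at a fixed length, with one character per residue.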